Worker Attitudes Toward Scheduling of Industrial Music


Willard A. Kerr 
Tulane University 


It is possible, though not necessarily true, that the average factory 
worker is the best authority on whether or not he should have music when 
he works, how much he should have, how he should have it, and when he 
should have it. At least, his opinions on these topics are important and 
should be investigated. Already it has been demonstrated that factory 
workers want music (2), that music helps certain aspects of morale (1), 
and that music increases net worker output in monotonous operations 
(3). 

Actual programming of music for factory audiences, with special 
reference to the time factor, now usually is done in one of the following 
ways by the plant broadcasting director: 


1. Fatigue Dip Periods. In some factory operations a temporary de- 
cline in output typically appears at about the middle of each half 
of the work spell. Some plants schedule most or all of their music 
programs at these periods of believed fatigue and boredom. 

2. Regular Interval Programs. Many plants set up a regular recorded 
music broadcast schedule which provides for 15, 20, or 30 minutes 
out of every hour of the work shift. 

3. Employee Request Programs. A few plants do not follow a definite 
time schedule, but play records as they are requested by employees. 
In one such plant the music, apparently well received, plays almost 
continuously. 


While advocates of the various methods report favorable results, it seems 
that no attempt has been made to evaluate preferences of workers for 
alternative methods in the time scheduling of music programs. The 
average worker’s opinion is a fact which must be regarded as important 
in evaluating the various methods, because the subjective fatigue-boredom
curve does not necessarily coincide with the familiar daily average
hourly production curve, and factors other than fatigue and boredom
may condition employees’ time desires for music. 

Using the tear method of response described elsewhere (4), a Music 
Timing Ballot was designed and administered to three groups of factory 
employees of the RCA Victor Division, Radio Corporation of America. 
All were accustomed to work to music. These 666 subjects represent a
group of 79 females and 138 males engaged in coil winding machine operations,
plus a group of 99 females and 32 males engaged in pressing phonograph
records in a Camden, New Jersey, factory, and 291 females and 7
males engaged in assembling radio tubes in a Harrison, New Jersey,
plant. Twenty failed to indicate sex. Average age of the employees in
the miscellaneous group is 30.6, in phonograph records 37.1, and in radio 
tubes 25.3. The miscellaneous and phonograph record manufacturing 
group heard a combination of the first two methods of programming 
mentioned above while the tubes group experienced the third method. 

Per cent of employees in each of the three groups giving a response to 
each question is indicated in Table 1. The responses to “How much 
music do you want on your work floor?” tend toward bimodality although 
a majority of respondents, except in the miscellaneous group, indicate a 
desire for eight hours of music out of an eight-hour work shift. The 
average worker wants between six and seven hours of music in eight hours 
of work. 

A plurality of workers, if they were to receive three hours of music 
daily, want it divided into sixteen sessions, but the average worker wants 
approximately ten sessions. 

In response to “When do you want it?” a distinct tendency appears
for the two middle hours of each half of the work shift to receive more 
votes than the first, pre-lunch, post-lunch, or closing hour of the shift. 
Music is least desired immediately before and immediately after lunch. 
These subjective reports, probably based on feelings of fatigue and bore- 
dom, are particularly significant in view of the known tendency in many 
factory operations for output to decline temporarily toward the middle 
of each half of the work spell. 

Tetrachoric intercorrelations among the time variables, sex, and age 
for all 666 subjects are shown in Table 2. Items one (how much) and 
two (number of sessions) come nearest of any two items to measuring the 
same thing, that is, a general liking for industrial music, and it is not surprising
that the correlation between these two items is .66. Apparently
morning or afternoon preference for music (Item 3) is not related to
liking for music (Items 1 and 2). Older employees tend to care slightly
less for industrial music and females seem to want more of it than do
males. It is true, however, that the mean age of the males reporting is 







Table 1

Preference of 666 Factory Workers for Arrangement and Timing of Broadcast
“Music While You Work”

Per Cent of Employees Giving Each Response to Each of Three Major Timing Questions

                 Coil Winding   Phono Pressing   Tube Assembly   Total
N                224            135              307             666

1. How much music do you want on your work floor?

   Lunch and rest periods only
   One hour out of eight
   Two hours out of eight
   Three hours out of eight
   Four hours out of eight
   Five hours out of eight
   Six hours out of eight
   Seven hours out of eight
   Eight hours out of eight

2. How do you want it?

   A  All in one session in the first half of shift                                01.0
   B  All in one session in the second half of shift                               00.5
   C  All in two sessions, one in the first half and one in second half of shift
   D  All in four sessions, one session of music in every two hours of work
   E  All in eight sessions, one session of music in every hour of work
   F  All in sixteen sessions, one session of music in every half hour of work

[Remaining percentage entries not legible in the source.]

Table 2

Tetrachoric Intercorrelations Among Five Items on the Music Timing Ballot
for 666 Factory Workers

                           2      3      5      6
1. How much               .66    .03    .40   −.28
2. How (sessions)                .00    .26   −.27
3. When (afternoon)                    −.06    .02
5. Female sex                                 −.59
6. Age





significantly higher than that of the females. Also, some older males 
tended to have jobs involving more supervisory responsibilities. These 
latter facts must be considered in interpreting the two following partial 
correlations. Correlation of amount of music desired with sex when age
is held constant by the technique of partial correlation is .30, and a similar
correlation of amount desired with age when sex is held constant is −.06.
These results indicate that sex (female) more than age is a determinant
of how much music a factory worker wants to hear while working; however,
it again must be emphasized that sex in itself may be less of a real
causal factor than the fact that work performed by the average male
subject in this study is of a less monotonous nature than that performed
by the average female employee.
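
The two partial correlations quoted above can be checked from the Table 2 coefficients with the usual first-order partial correlation formula. The sketch below is illustrative only: the function name `partial_corr` and the variable names are assumptions, and the inputs are the rounded coefficients printed in Table 2 (amount with female sex .40, amount with age −.28, female sex with age −.59).

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Rounded coefficients as printed in Table 2.
r_amount_sex = 0.40   # item 1 with item 5 (female sex)
r_amount_age = -0.28  # item 1 with item 6 (age)
r_sex_age = -0.59     # item 5 with item 6

# Amount of music desired vs. sex, age held constant.
amount_sex_given_age = partial_corr(r_amount_sex, r_amount_age, r_sex_age)
# Amount of music desired vs. age, sex held constant.
amount_age_given_sex = partial_corr(r_amount_age, r_amount_sex, r_sex_age)

print(round(amount_sex_given_age, 2))
print(round(amount_age_given_sex, 2))
```

Rounded to two places, these inputs give .30 and −.06, in agreement with the values reported in the text.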


Received May 8, 1944. 
Analysis of Two Point-Rating Job Evaluation Plans


R. C. Rogers 
De Laval Steam Turbine Company, Trenton, New Jersey 


The primary aim of any job evaluation system is to provide manage- 
ment with a valid measure of relative job worth upon which to build its 
wage structure. From the standpoint of measurement, the first and most 
important task of job evaluation is the construction of a battery of dis- 
criminative measures which, when properly weighted, will furnish a 
reliable index of the relative value of all jobs in the population being 
analyzed.

Underlying most of the existing systems of job evaluation are the assumptions
(a) that the job evaluation plan provides measures of “job”
characteristics rather than “employee” characteristics, (b) that each of
the factors provides a discrete measure of some aspect of job worth and
that they are capable of independent evaluation, (c) that each of the
factors bears a significant association with the total measure of job worth,
(d) that each factor is “weighted” in proportion to its unique contribution
to the total evaluation, and (e) that the plan includes all or most of
the significant “common denominators” of job worth.¹

A recent factor analysis of the eleven-factor NEMA method of job 
evaluation (1, 5) by Lawshe and Satter (2) demonstrates that a number 
of the above assumptions are untenable. Their results indicated that 
“most of the variance in total point ratings” could be accounted for by 
one primary factor, “Skill Demands,” which was made up of attributes 
or characteristics possessed by the successful employee. The second 
factor isolated in their analysis, “Job Characteristics,” was made up of
physical characteristics of the job itself “with which the employee must
contend.” Their criterion measure, total points, did not have a significant
loading on this factor.

The correlations in their study revealed that certain of the variables, 
e.g., working conditions and physical demand, were not significantly 
associated with total points and that a number of the factors did not 
provide unique measurements. 

This paper presents the results of a similar statistical analysis of two 


1 The assumptions of reliability and validity underlying these systems are not treated 
in this paper and hence are not included in this list. 
other job evaluation plans adapted for use in the evaluation of wage
(hourly-rated) and salary jobs (6).²


The Job Evaluation Plans 


The job evaluation plan for wage (factory) jobs provides for the 
point-rating of the following six factors (numbers in parentheses are the 
maximum point values possible for each factor): Mentality (100); Skill 
(400); Responsibility (100); Mental Application (50); Physical Applica- 
tion (50); and Working Conditions (100). 

The job evaluation plan for the salary (office) jobs provides for the 
point-rating of the following ten factors: Mentality (150); Training (300);
Analytical Ability (300); Initiative (300); Personal Requirements (300);
Executive Responsibility (325); Monetary Responsibility (260); De- 
pendability and Accuracy (65); Mental Application (70); and Physical 
Application (30). 

The “total point” rating for each job is obtained by summating the 
values assigned to the individual factors and adding a constant 400 
“base points” to this total. These total ratings are then translated into 
Job Grades which encompass defined ranges of total point values. 
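
The scoring rule just described (sum the factor ratings, add 400 base points, then translate the total into a Job Grade) can be sketched as follows. This is a minimal illustration, not the company's procedure: the factor maxima are those listed above for the wage plan, but the example ratings and the equal-width grade ranges in `job_grade` are hypothetical, since the paper does not give the actual point ranges for each grade.

```python
# Maximum point values for the six wage-plan factors, as listed in the text.
WAGE_PLAN_MAX = {
    "Mentality": 100,
    "Skill": 400,
    "Responsibility": 100,
    "Mental Application": 50,
    "Physical Application": 50,
    "Working Conditions": 100,
}
BASE_POINTS = 400  # constant added to every job


def total_points(ratings, plan_max=WAGE_PLAN_MAX, base=BASE_POINTS):
    """Sum the factor ratings and add the constant base points."""
    if set(ratings) != set(plan_max):
        raise ValueError("ratings must cover exactly the plan's factors")
    for factor, points in ratings.items():
        if not 0 <= points <= plan_max[factor]:
            raise ValueError(f"{factor} rating out of range: {points}")
    return base + sum(ratings.values())


def job_grade(total, grade_width=50, base=BASE_POINTS):
    """HYPOTHETICAL grade table: equal-width ranges starting at the base."""
    return (total - base) // grade_width + 1


# Hypothetical ratings for one factory job (not taken from the study).
example_job = {
    "Mentality": 40,
    "Skill": 200,
    "Responsibility": 30,
    "Mental Application": 20,
    "Physical Application": 25,
    "Working Conditions": 35,
}

example_total = total_points(example_job)
example_grade = job_grade(example_total)
print(example_total, example_grade)
```

Because Skill can contribute up to 400 of the 800 ratable points, it has far more room to vary than any other factor, which bears on the correlations with Job Grade discussed later.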


Results 


The absolute point-values assigned to each job evaluation factor and 
the Job Grade for 170 wage (factory) jobs and 295 salary (office) jobs 
were coded and punched in I.B.M. cards. In addition, the following 
variables taken from the job descriptions (but not included as such in 
the job evaluation plans) were coded and punched: for the wage jobs— 
Learning Time and Educational Requirements; for the salary jobs— 
Learning Time. The two populations of jobs were treated separately. 

Intercorrelations among the variables in the wage job evaluation 
plan are presented in Table 1; for the salary plan in Table 3. These 
matrices were further analyzed by means of Thurstone’s Centroid factor 
analysis technique (4). Factor loadings for the variables in the wage 
plan are given in Table 2; for the salary plan in Table 4. Maximized 
multiple correlation coefficients (3) were computed from these data in 
order to determine the best battery of measures in each plan. 


Discussion 


Throughout the discussion of these results, it is important to bear in 
mind the following major limitations of such a study: (a) To some extent, 


2 This study is one of a series conducted in 1944 in connection with a research program
aimed at the analysis of existing job evaluation systems in terms of their adequacy
as measuring instruments.
the magnitude of the correlations between the factors in each plan and
Job Grade (an “internal” criterion) is a function of the a priori weights 
assigned to the factors; (b) The reliability of the assigned point-values is 
unknown; (c) There is no estimate of the statistical validity of these 
measures in terms of an external criterion of job worth. 

Wage Job Evaluation Plan. Considering only the six variables 
included in the wage plan, it is evident (Table 2) that Factor I accounts 
for most of the variance in Job Grade. Following Lawshe and Satter’s 
terminology (2), this factor might be named “Skill Demands.” The 
characteristics having high loadings on Factor I, Skill, Mentality, Mental 
Application and Responsibility, might be taken to represent those char- 
acteristics which the employee must bring to the job in order to perform 
it successfully. 


Table 1

Intercorrelations—Wage Jobs
N = 170

Variables: (1) Job Grade; (2) Responsibility; (3) Skill; (4) Mentality;
(5) Mental Applic.; (6) Physical Applic.; (7) Working Condit.;
(8) Learning Time; (9) Educational Req.

[Correlation entries not legible in the source, apart from isolated values: .01; −.17, −.11, −.17, −.17.]


Table 2
Factor Loadings—Wage Jobs

Unrotated factors: I, II, III

Variables: (1) Job Grade; (2) Responsibility; (3) Skill; (4) Mentality;
(5) Mental Applic.; (6) Physical Applic.; (7) Working Condit.;
(8) Learning Time; (9) Educational Req.

[Loadings not legible in the source.]
Table 3
Intercorrelations—Salary Jobs

N = 295

                         (2)   (3)   (4)   (5)   (6)   (7)   (8)   (9)  (10)  (11)  (12)
 (1) Job Grade           .84   .90   .92   .95   .89   .77   .87   .74   .78  −.05   .86
 (2) Mentality                 .77   .86   .81   .70   .48   .67   .69   .79  −.24   .77
 (3) Training                        .86     8   .72   .62   .73   .69   .77  −.06   .95
 (4) Analytical Abil.                        2   .76     6   .77   .70   .79  −.12   .86
 (5) Initiative                                  .87   .69   .82   .67   .75  −.06   .83
 (6) Personal Req.                                     .73   .77   .55   .61  −.01   .70
 (7) Exec. Resp.                                             .75   .44   .44   .21   .62
 (8) Monetary Resp.                                                .63   .60   .03   .73
 (9) Depend. and Acc.                                                    .71  −.30   .62
(10) Mental Applic.                                                           −.32   .73
(11) Physical Applic.                                                               −.03
(12) Learning Time











Table 4
Unrotated Factor Loadings—Salary Jobs

                                      I      II     h²

 (1) Job Grade                       .98    .15    .99
 (2) Mentality                       .88   −.24    .83
 (3) Training                        .92    .08    .85
 (4) Analytical Ability              .94   −.04    .88
 (5) Initiative                      .94    .12    .90
 (6) Personal Requirements           .85    .26    .78
 (7) Executive Responsibility        .69    .49    .72
 (8) Monetary Responsibility         .85    .25    .78
 (9) Dependability and Accuracy      .77   −.29    .68
(10) Mental Application              .83   −.34    .81
(11) Physical Application           −.13    .57    .34
(12) Learning Time                   .90    .11    .81











Factor II, with high loadings on Working Conditions and Physical 
Application, was called “Job Characteristics” and reflects those char- 
acteristics inherent in the jobs themselves with which the employee must 
contend. It will be noted that Job Grade does not have a significant 
loading on Factor II. 

None of the measures in this plan has a significant loading on Factor
III. And, although Learning Time and Educational Requirements
show low positive loadings, they too have their highest weights on
Factors I and II. The existence of this third factor is probably attributable
to these two variables, neither of which is included in the job evaluation
plan as a separate factor.

The shrunken multiple correlation with Job Grade is .97 with only two
of the variables, Skill and Working Conditions, included in the battery.
Since Skill alone correlates .96 with Job Grade, the increase due to the
addition of Working Conditions cannot be considered significant. Addition
of any other variable to this battery actually decreased the magnitude of
this correlation.³ The point-ratings assigned to the other job evaluation
factors therefore contribute nothing to the measurement of relative job 
worth over and above that already assessed by the ratings assigned to 
Skill. It will also be noted that the arbitrary weights assigned to the 
factors do not accurately reflect the magnitude of their association with 
Job Grade. 

Examination of the distribution of ratings for Working Conditions 
and Physical Application indicates that their low correlations with Job 
Grade may be attributed to the lack of spread in the ratings. Since most 
of the factory jobs have approximately the same ratings on these factors 
it would seem advisable that they be “priced” in the job evaluation plan 
as a constant for all jobs, i.e., combined with the standard 400 “base 
points” already assigned to each job. Since the correlations computed 
in this study reflect relationships throughout the entire range, there is 
still the possibility that these factors may be significantly discriminative 
with respect to certain categories of jobs and hence could not be assigned 
a constant point-value for all jobs. Further statistical treatment is
needed to check this possibility adequately.

Salary Job Evaluation Plan. In this plan also, most of the variance 
in Job Grade can be accounted for by one primary factor, “Skill De- 
mands” (Table 4), which reflects “employee” characteristics, i.e., 
Initiative, Analytical Ability, Training, etc. 

Factor II, Job Characteristics, composed primarily of Physical Application,
again reflects the demands made upon the employee by conditions
inherent in the job. Executive Responsibility has a low positive
loading (.49) on this factor, but its highest loading (.69) is on Factor I,
Skill Demands.
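
The h² column of Table 4 can be related to the two loadings with a short calculation. The sketch below assumes the usual definition for uncorrelated (unrotated centroid) factors, that a variable's communality is the sum of its squared loadings; the function name and the choice of four rows are illustrative, and the figures are the printed Table 4 values for those rows.

```python
def communality(loadings):
    """Sum of squared loadings across (assumed uncorrelated) factors."""
    return sum(a * a for a in loadings)


# (Factor I loading, Factor II loading, printed h2) for four Table 4 rows.
TABLE_4_ROWS = {
    "Initiative": (0.94, 0.12, 0.90),
    "Executive Responsibility": (0.69, 0.49, 0.72),
    "Mental Application": (0.83, -0.34, 0.81),
    "Physical Application": (-0.13, 0.57, 0.34),
}

for name, (factor_1, factor_2, printed_h2) in TABLE_4_ROWS.items():
    h2 = communality((factor_1, factor_2))
    # Share of the communality carried by Factor I ("Skill Demands").
    share_1 = factor_1 ** 2 / h2
    print(f"{name}: h2 = {h2:.3f} (printed {printed_h2:.2f}), "
          f"Factor I share = {share_1:.2f}")
```

For Executive Responsibility the two loadings give about .716, close to the printed .72, and roughly two thirds of that communality comes from Factor I, which is consistent with the statement that its highest loading is on Skill Demands.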

The multiple correlation with Job Grade is .99 with only three variables,
Initiative, Training, and Mental Application, contributing to the
multiple. The remaining seven measures therefore contribute little to
the effectiveness of this evaluation plan. In fact, since Initiative alone 
correlates .95 with Job Grade, it seems questionable whether Training and 
Mental Application contribute enough to the final evaluation to justify 


3 That is, addition of another variable to the multiple adds more chance error than
actual validity and hence decreases the magnitude of the correlation. 
including them. This is especially true when we consider the time and
cost of arriving at the final ratings. 

General Considerations. Although on the whole, these plans have 
been employed successfully in the establishment of equitable wage 
structures, and have proved their value from the standpoint of industrial 
relations, their effectiveness as measuring instruments may be seriously 
questioned. The results of this study, as well as those of Lawshe and 
Satter, emphasize the fact that many of the principles and techniques of 
scientific measurement have been neglected in the construction and 
evaluation of these plans. As a result, many of the elaborate multi- 
factored systems currently employed contain a number of components 
which could be dropped from the battery without significantly affecting 
the accuracy of the final evaluation. 

However, it does not seem reasonable to expect that such a complex 
criterion as relative job value can be reliably measured by means of a 
single characteristic, as the present studies might indicate. Further investigations
must be undertaken with a view to developing a reliable
battery of discriminative measures in which each of the factors is capable 
of making a significant and unique contribution to the total evaluation 
of job worth. 


Summary and Conclusions 


This paper has presented the results of a statistical analysis of two 
point-rating job evaluation plans being employed in a metal machining 
industry for the valuation of wage and salary jobs. An attempt was 
made to analyze these plans from the standpoint of their effectiveness as 
measuring instruments. Within the limitations of the study it may be 
concluded that: 


(1) The present Job Grades in the wage plan could have been deter- 
mined from the point-ratings assigned to the Skill component alone (with 
the possible addition of Working Conditions), and those in the salary 
plan from the ratings assigned to Initiative, Training, and Mental Ap- 
plication. 

(2) In each plan, one primary (centroid) factor, “Skill Demands,”
accounts for most of the variance in Job Grade. This factor is composed 
of those characteristics which the successful employee must bring to the 
job or be capable of developing on the job. 

(3) The second factor in each plan, “Job Characteristics,” is com- 
posed of those characteristics inherent in the job itself with which the 
employee must contend. This factor is not significantly associated with 
Job Grade in either plan. 
(4) Many of the variables in each plan are not capable of independent
evaluation in their present form, and the a priori weights which have been 
assigned to them do not accurately reflect the magnitude of their associ- 
ation with Job Grade. 


Received December 21, 1946. 
The MacQuarrie Test for Mechanical Ability:
I. Selecting Radio Assembly Operators 


Charles H. Goodman 
Radio Corporation of America 


This is the first of four articles describing some experiments with the 
MacQuarrie Mechanical Ability Test¹ in a radio manufacturing company.
The first of these four articles is concerned with the possible use of the 
MacQuarrie test for selecting radio assembly operators. The second 
article will describe the findings obtained as a result of a follow-up study 
of the 329 subjects tested with the MacQuarrie. The third article will 
set forth the results of a factor analysis of the sub-tests of the MacQuarrie, 
while the fourth article will consist of a motion analysis of the MacQuar- 
rie’s sub-tests. 

The experimental work described in these four papers on the Mac- 
Quarrie test was part of a psychological research program which was 
undertaken for the purpose of finding quick selection methods in hiring 
radio manufacturing workers. The decision to include the MacQuarrie 
test in this research program was based upon the two features of this 
test which are highly desirable in an industrial situation; namely, it is a 
group test, and, secondly, it is a relatively quick measure requiring ap- 
proximately 30 minutes to administer. 


Selective Capacity of the MacQuarrie Test 


Subjects. Three hundred and twenty-nine females, hired by the
employment office for radio assembly work during the period of November
1943 to March 1944, served as the subjects of this study. No attempt
was made to select² the population. The subjects were simply
the first 329 persons hired during the period mentioned. The ages of the 
subjects ranged from 16 to 64 years with a mean age of 27.3 years and a 
sigma of 10.2 years. Their ages scatter into an inverse J curve with a 
modal value of 112 cases or 34 per cent in the first age interval of 15 to 


1 MacQuarrie, T. W. MacQuarrie test for mechanical ability. Los Angeles: Cali- 
fornia Test Bureau. 

2 Some selection has, of course, taken place through the employment interview.
The factory’s location in a small rural town where the population is quite homogeneous 
has also had, in all probability, some selective influence on the population used in this 
study. 
19 years. Only 50 subjects, or 15 per cent of the total group, were more
than forty years of age. All subjects were given the MacQuarrie test 
immediately after they were hired. 

The Job. In order to comprehend this study more fully, it will be
helpful to present a brief description of the work performed by an assembly
operator in this factory. The job summary taken from the job
analysis schedule describes the job as follows: Assembles radio components,
such as tube sockets, transformers and capacitors on chassis to
form a complete set; assembles terminal boards and other small assemblies
using hand tools; mounts subassemblies on chassis and secures them
in place using nuts and bolts or soldering iron and rosin-core solder; removes
insulation from wires using sandpaper or emery cloth, and tins
stripped leads; may specialize in one phase of assembly details.

Training. All radio assembly operators are trained for three days in 
the Vestibule Training School before assignment to assembly lines. 
During training the new operators are taught how to solder, crimp (the 
operation of looping wires into terminals or on lugs), and assemble. The 
basic tools the trainees learn to use are the soldering iron, screw driver, 
and pliers. This training is designed to give new operators some famili- 
arity with the tools they must use and some practice on the operations 
they must perform. At the end of two days most operators have acquired
enough skill to use their tools and perform the tasks they have
been taught.³

Criterion. On the third day of training the new operators are given 
a manual test to determine how well they have mastered their instruction. 
The test consists of three different models, A, B, and C, which they must 
construct. Beginning with model A, they are allowed sufficient time to 
construct at least two reproductions of the model and after the specified 
amount of time has elapsed they are told to stop. The same procedure is 
followed in constructing models B and C. Before starting they are urged
to do their best work. Upon completion of the test they are instructed 
to select one model which they consider their best reproduction of model 
A, one of model B, and one of model C. 

The instructor determines the amount of work done during the test 
by each operator and uses this for a quantitative score. Qualitative 
criteria are then applied to each of the models the operators have selected. 
The following factors are scored: wire dress; length of wire in crimps; 
number of turns in crimp; excess solder; insufficient solder; rosin or cold 
joint; loose joint; neatness; and general appearance of the model. 

Performance of the operator on this test carries the largest amount of 


3 Peak efficiency is not reached by new operators nor is it expected of them until they
have worked for several weeks on the production line. 
weight in the instructor’s final rating of the new operator. Some consideration
is also given to attitude and progress during training. The
rating system used in the Vestibule Training School is shown in Table 1.
These ratings were used as the criterion⁴ of success after the letter grades
had been converted into numerical values, which are also shown in Table 1.








Table 1
Vestibule Training School Rating System

Letter   Numerical
Grade    Equivalent   Description of Letter Grade

E        14           Excellent
E-       13           Excellent with some reservations
G+       12           Very good
G        11           Good
G-       10           Below good
A+        9           Better than average
A         8           Average
A-        7           Average with reservations
F+        6           Better than fair
F         5           Fair
F-        4           Below fair standards
P+        3           Just above poor
P         2           Poor
P-        1           Unacceptable





The writer is aware of the very fine grading used in this rating scale, 
and of the possibility that it was beyond the ability of the training in- 
structor to discriminate so finely. However, no attempt was made to 
change the rating system for two reasons. First, it had been in operation 
in this working situation for some time and the training instructor was 
quite familiar with it, and, secondly, the ratings were heavily dependent 
upon the objective criteria based on the performance tests which have 
previously been described. 

Study of the distribution of the 329 ratings on this fourteen point 
scale shows the modal value of 78 cases (23 per cent) to fall at average or 
its numerical equivalent of 8.0. The calculated mean of the distribution 
is 7.9, making it appear that there is close correspondence between mean 
and mode, and that the distribution is fairly normal. However, there are 


4 Individual production or quality records are not kept for this job, since no one
operator assembles a complete radio chassis. Foremen rate the workers after six weeks 
of production. However, the use of these foreman ratings as a criterion would have 
involved many raters, while all subjects in this study were rated by the same Vestibule 
Training instructor. 
two other large peaks in this distribution. Fifty-six cases (17 per cent)
fall at Good or its numerical equivalent of 11, and 35 cases (11 per cent) 
fall at Fair or its numerical equivalent of 5. It would appear from this 
distribution that the instructor tended to rate heavily on the three 
categories of Fair, Average, and Good. 

A regrouping of the letter grades by the writer, combining P-, P,
and P+ as the first step, F-, F, and F+ as the second step, A-, A, and
A+ as the third step, G-, G, and G+ as the fourth step, and E- and
E as the fifth step, showed a much more normalized distribution of the
ratings. It would appear that the training instructor was not able to
discriminate as finely as the rating scale required.
Figure 1 shows the distribution of ratings as made by the Vestibule 
Training Instructor and the distribution of ratings after they had been 
regrouped by the writer. 
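
The regrouping of the fourteen-point scale into five steps can be written as a simple mapping. The sketch below is illustrative: the letter-to-number table follows Table 1, the integer rule `(n + 2) // 3` is merely one convenient way to express the grouping described above, and the sample list of ratings is hypothetical rather than the study's data.

```python
from collections import Counter

# Letter grades and numerical equivalents from Table 1.
LETTER_TO_NUMBER = {
    "E": 14, "E-": 13,
    "G+": 12, "G": 11, "G-": 10,
    "A+": 9, "A": 8, "A-": 7,
    "F+": 6, "F": 5, "F-": 4,
    "P+": 3, "P": 2, "P-": 1,
}
STEP_NAMES = {1: "Poor", 2: "Fair", 3: "Average", 4: "Good", 5: "Excellent"}


def regroup(number):
    """Collapse the 14-point rating to five steps: 1-3, 4-6, 7-9, 10-12, 13-14."""
    if not 1 <= number <= 14:
        raise ValueError(f"rating out of range: {number}")
    return (number + 2) // 3


def regrouped_distribution(letter_ratings):
    """Count how many ratings fall in each of the five regrouped steps."""
    return Counter(STEP_NAMES[regroup(LETTER_TO_NUMBER[g])] for g in letter_ratings)


# Hypothetical sample of instructor ratings (not the study's 329 cases).
sample = ["A", "A", "G", "F", "A+", "G-", "P+", "E-", "A-", "F+"]
sample_counts = regrouped_distribution(sample)
print(sample_counts)
```

Pooling adjacent grades in this way smooths out the separate peaks at Fair, Average, and Good, which is why the regrouped distribution looks more nearly normal.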

Prediction. The adequacy of the MacQuarrie test in selecting as- 
sembly operators was determined by calculating Pearson correlations of 
the subjects’ total test score and sub-test scores with the criterion. These 
correlations are shown in Table 2. 


Table 2
Correlations of the MacQuarrie Total Test Score and Sub-test Scores with the Criterion

        Total   Part I   Part II   Part III   Part IV   Part V   Part VI   Part VII

r       +.42    +.32     +.18      +.13       +.31      +.35     +.32      +.27
P.E.    .032    .035     .037      .038       .035      .035     .035      .037





Total test score on the MacQuarrie yielded the highest r, +.42, with
the criterion. Of the sub-tests the highest correlations with the criterion
were Location +.35, Tracing +.32, and Copying +.31. To obtain the
optimum yield of the MacQuarrie with the criterion, multiple R’s were
calculated for various combinations of the sub-tests with the criterion.
Table 3 shows the multiple R’s that were calculated. The criterion r’s
for the multiple correlations are shown in Table 2, and the intercorrelations
are shown in Table 4.

By optimally combining all of the sub-tests, the multiple R with the
criterion was +.46, which is a small gain over the highest single r of +.42. 
It is interesting to note that a combination of the four sub-tests of Trac- 
ing, Copying, Location, and Blocks yielded a multiple R of +.44 which is 
slightly larger than the best single r. For quick economical use of the 
MacQuarrie the three sub-tests of Tracing, Location, and Blocks yielded 
a multiple R of +.41 which is almost as large as the r obtained when using 
the total score of all seven sub-tests of the MacQuarrie. 
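
A multiple R of this kind can be computed directly from the criterion correlations and the sub-test intercorrelations by solving the normal equations for the standardized weights. The sketch below is illustrative rather than the author's computing routine: the function names are assumptions, and the inputs assume that Table 4 is read as an upper triangle in the column order shown (Tracing with Location .341, Tracing with Blocks .406, Location with Blocks .538), with criterion r's of +.32, +.35, and +.32 for Tracing, Location, and Blocks.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multiple_r(criterion_r, intercorr):
    """Multiple R from predictor-criterion r's and predictor intercorrelations."""
    betas = solve(intercorr, criterion_r)  # standardized regression weights
    r_squared = sum(b * r for b, r in zip(betas, criterion_r))
    return math.sqrt(r_squared)


# Tracing, Location, Blocks (values as read from Tables 2 and 4).
criterion = [0.32, 0.35, 0.32]
intercorrelations = [
    [1.000, 0.341, 0.406],
    [0.341, 1.000, 0.538],
    [0.406, 0.538, 1.000],
]
three_test_r = multiple_r(criterion, intercorrelations)
print(round(three_test_r, 3))
```

With these rounded inputs the result comes out at about .42, close to the +.41 reported for this three-test battery; exact agreement is not to be expected, since the published coefficients are rounded and no shrinkage correction is applied here.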
Fig. 1. Distribution of vestibule school efficiency ratings of
329 radio assembly operators. 
Table 3
Multiple Correlations of the MacQuarrie Sub-tests with the Criterion

MacQuarrie Sub-Tests: Tracing, Tapping, Dotting, Copying, Location, Blocks, Pursuit

[Table entries not legible in the source.]
Since age was found to correlate −.22 with the criterion, a multiple
correlation was computed, using the seven sub-tests of the MacQuarrie 
and age with the criterion. This was done with the thought that since 
age correlated negatively with the criterion it might contribute towards 
an increase in the R. The resultant R, however, showed no increase 
over the best R of +.46 which was obtained when using the seven Mac- 
Quarrie sub-tests with the criterion. The fact that no increase was 
obtained when the negative factor of age was added may be explained 
on the basis of the proportion of “causes’’ influencing the age factor, the 
sub-test factors, and the criterion that are common to all. 

Apparently the “causes” which produce a negative correlation be- 
tween age and the criterion are identical with those that produce the 
negative correlation between age and the sub-tests. Consequently, the 
addition of age into the multiple correlation does not bring any new cor- 
related “causes” to bear on the relationship. 

Age Differences. The question arose as to whether there was any 
relationship between age and MacQuarrie test scores. To answer this question 
r’s were calculated, and are shown in Table 5. 


Table 4 
Inter-correlations of the Seven MacQuarrie Sub-Tests 

            Tapping   Dotting   Copying   Location   Blocks   Pursuit 
Tracing      .483      .549      .437      .341       .406     .425 
Tapping                .407      .310      .294       .290     .290 
Dotting                          .340      .430       .320     .360 
Copying                                    .540       .520     .480 
Location                                              .538     .437 
Blocks                                                         .459 

Table 5 
Correlations of the MacQuarrie Total Test Scores and Sub-test Scores with Age 

     Total 
     Score   Part I   Part II   Part III   Part IV   Part V   Part VI   Part VII 
r    -.38    -.34     -.23      -.32       -.21      -.23     -.29      -.33 





All of the correlations are negative and small. However, each r is larger 
than four times its P.E., indicating that these inverse correlations have 
some degree of significance greater than zero. It appears then, that as 
age increases, the MacQuarrie test scores tend in some degree to decrease. 

The question of whether a similar relationship existed for age and 
vestibule training ratings was also answered by correlation. The result 
showed an r of —.22, again an inverse relationship. The P.E. for this r 
is .037, and similarly the correlation is larger than four times its P.E., 
which indicates some degree of significance above zero. It would seem 
then that there is some tendency for the older individuals who enter the 
Vestibule Training School to receive lower merit ratings. 
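
The four-times-P.E. rule used above can be checked with a few lines of arithmetic. The sketch below is an illustration under stated assumptions rather than the author's computation: it takes N = 329 for the age correlation and shows two common probable-error formulas. The reported P.E. of .037 is reproduced by .6745/√N, the form for a true correlation of zero; the form with the (1 - r²) factor gives about .035. The function names are hypothetical.

```python
from math import sqrt


def probable_error_zero_r(n):
    """P.E. of r when the true correlation is assumed to be zero."""
    return 0.6745 / sqrt(n)


def probable_error_r(r, n):
    """P.E. of an obtained r, using the (1 - r^2) factor."""
    return 0.6745 * (1 - r * r) / sqrt(n)


n = 329                # assumed sample size for the age correlations
r_age_rating = -0.22   # age with vestibule training rating

pe_zero = probable_error_zero_r(n)
pe_obtained = probable_error_r(r_age_rating, n)

print(round(pe_zero, 3))      # 0.037
print(round(pe_obtained, 3))  # 0.035
# The rule of thumb: |r| must exceed four probable errors.
print(abs(r_age_rating) > 4 * pe_zero)  # True
```

By the same rule, the smallest correlation in Table 5 (-.21) also exceeds four probable errors under this assumed N.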

Superficial inspection of the scattergrams from which these inverse 
r’s were computed might readily have led one to believe that the rela- 
tionships were linear. However, plotting of the regression lines raised 
some question as to the linearity of these relationships. In order to 
determine how significant these trends were, etas were computed. 

Table 6 shows the etas that were obtained, and the chi-square values 
which were calculated to test the linearity of the regression, in order to 
determine whether the curvature might be due to chance deviation from 
linearity. 


Table 6 
Eta Coefficients and Chi-Square Test Values for Linearity 

Variables                            Eta    Chi-Square   Significance 
Age and Vestibule Training Rating    .41      46.93      1 per cent level 
Age and Total MacQuarrie Score       .64      15.90      5 per cent level 
Age and MacQuarrie, Part 1           .42       5.10      Not significant 
Age and MacQuarrie, Part 2           .28       6.89      Not significant 
Age and MacQuarrie, Part 3           .34      12.72      Not significant 
Age and MacQuarrie, Part 4           .40      15.79      Not significant 
Age and MacQuarrie, Part 5           .00      10.87      Not significant 
Age and MacQuarrie, Part 6           .27       7.40      Not significant 
Age and MacQuarrie, Part 7           .38      13.20      Not significant 

Table 6 shows that such curvature as may exist in all but two cases 
may well be due to chance deviation. In the case of age and Vestibule 
Training rating, the chi-square test shows that the curvature is highly 
significant, being at the 1 per cent level. This shows a highly significant 
departure from rectilinearity, and one would expect similar findings with 
successive samplings. In the second case where a departure from 
rectilinearity is shown, that of age and total MacQuarrie test score, the 
chi-square test shows some significance, being at the 5 per cent level. 

In view of the significant inverse r’s that were found, it was an ex- 
pected conclusion that there would be significant differences between the 
mean test scores of the younger and older subjects. Calculation of the 
critical ratios of the mean MacQuarrie sub-test scores showed the younger 
age group to be significantly better than the older age group. These 
findings confirm the inverse correlation evidence, that the younger sub- 
jects of the study do better on the MacQuarrie than the older subjects. 
This was also found to be true for the ratings received by the younger age 
group in Vestibule Training School. A critical ratio of 15.00 was ob- 
tained showing that the younger age group received significantly higher 
ratings than the older group. 

Efficiency of the MacQuarrie Test. In its optimum weighting, 
the maximum correlation of the MacQuarrie Test for predicting future 
success of radio assembly operators was calculated to be R = .46. An 
evaluation of this R by means of Kelley’s coefficient of alienation would 
indicate the effectiveness of the MacQuarrie in predicting individual 
success of radio assembly operators to be about 12 per cent better than 
simply hiring without the use of this test. 
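
The arithmetic behind this figure can be sketched briefly. Kelley's coefficient of alienation is k = √(1 - R²), and the index of forecasting efficiency is 100(1 - k). The function names below are hypothetical, and the snippet is only an illustration of the standard formula. With R = .46 it gives a little over 11 per cent, which agrees with the "about 12 per cent" quoted above.

```python
from math import sqrt


def coefficient_of_alienation(r):
    """Kelley's k: the share of the criterion spread left unexplained."""
    return sqrt(1 - r * r)


def forecasting_efficiency(r):
    """Per cent improvement over chance prediction: 100 * (1 - k)."""
    return 100 * (1 - coefficient_of_alienation(r))


efficiency = forecasting_efficiency(0.46)
print(round(coefficient_of_alienation(0.46), 3))  # 0.888
print(round(efficiency, 1))                       # 11.2
```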

To the practical plant manager of an industrial concern an increase 
in efficiency of 12 per cent in hiring might not be too readily accepted or 
too encouraging in the light of costs for an aptitude testing program. 

In order to obtain the maximum efficiency of the MacQuarrie test in 
this industrial situation for hiring future applicants it was decided that 
use would be made of the Taylor-Russell * selection ratio tables. Under 
ordinary methods of hiring, the plant superintendent judged that ap- 
proximately 50 per cent of the employees hired were satisfactory. Agree- 
ment was reached with the plant superintendent that in future hiring only 
those individuals who made MacQuarrie test scores that placed them in 
the top 30 per cent of the distribution of those being considered should 
be hired. 

With an R of .46, the selection ratio set at 30 per cent, and using the 
estimate that 50 per cent of the employees hired by the old method of 
interviewing were satisfactory, the Taylor-Russell Tables indicate that 71 
per cent of those selected should be satisfactory when using the 
MacQuarrie test as a selection device. 

* Taylor, H. C., and Russell, J. T. The relationship of validity coefficients to the 
practical effectiveness of tests in selection: Discussion and tables. J. appl. Psychol., 
1939, 23, 565-578. 
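
The Taylor-Russell entry can be approximated directly from the bivariate normal model on which such tables rest. The following is a minimal sketch, assuming test score and job success are jointly normal with correlation equal to the validity coefficient; the function name and the simple midpoint integration are illustrative choices, not the published procedure. For a validity of .46, a base rate of .50, and a selection ratio of .30 it returns a value in the neighborhood of .71, in line with the tabled 71 per cent.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def proportion_satisfactory(validity, base_rate, selection_ratio, steps=4000):
    """Expected share of satisfactory workers among those selected.

    Integrates P(success | test score x) over the selected region
    (scores above the cutoff) and divides by the selection ratio.
    """
    x_cut = STD.inv_cdf(1 - selection_ratio)  # test-score cutoff
    y_cut = STD.inv_cdf(1 - base_rate)        # success cutoff on the criterion
    spread = sqrt(1 - validity ** 2)          # residual S.D. of the criterion
    upper = x_cut + 8.0                       # far enough into the tail
    width = (upper - x_cut) / steps
    total = 0.0
    for i in range(steps):
        x = x_cut + (i + 0.5) * width         # midpoint of each slice
        p_success = 1 - STD.cdf((y_cut - validity * x) / spread)
        total += STD.pdf(x) * p_success * width
    return total / selection_ratio


expected = proportion_satisfactory(0.46, 0.50, 0.30)
print(round(expected, 2))
```

With a validity of zero the function returns the base rate itself, which is the sense in which 50 per cent is the figure to beat.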

The actual testing of this prediction based upon the Taylor-Russell 
selection ratio was never realized. Shortly after the decision had been 
made to use the Taylor-Russell selection ratio, the need for employees 
became so acute, that every applicant seeking work was hired. 

A study in retrospect, however, was made of the application of the 
Taylor-Russell selection ratio, using the same conditions as were to be 
used for future applicants. These conditions were applied to the group 
that was originally hired, and will be described in the second of these 
articles. 


Summary and Conclusions 


This study has presented data based upon the MacQuarrie Mechanical 
Ability scores of 329 female radio assembly operators who were hired 
during the five month period from November 1943 to March 1944 at a 
radio manufacturing company. The purpose of the investigation was to 
determine the usefulness of the MacQuarrie Mechanical Ability Test as a 
selective device for hiring radio assembly operators. 

On the basis of the findings of this study, the following conclusions 
appear to be warranted: 


1. The total test score of the MacQuarrie correlates .42 with the 
criterion. 

2. The MacQuarrie sub-test scores correlate with the criterion as 
follows: Part I, .32; Part II, .18; Part III, .13; Part IV, .31; Part 
V, .35; Part VI, .32; and Part VII, .27. 

3. Part V, the Location test, yields an r with the criterion of .35 
which is only .07 less than the total test score yield with the 
criterion. 

4. A multiple R of the seven sub-tests with the criterion shows that 
when all the sub-tests were used the yield was R = .46, which was a 
slight increase over the total test score r of .42. 

5. The use of the four sub-tests of Tracing, Copying, Location and 
Blocks when optimally weighted in a multiple correlation yields 
an R of .44, which is larger than the total test score correlation. 

6. The sub-tests of Tracing, Location and Blocks properly weighted 
yield an R of .41. These three sub-tests produce results almost 
as large as the seven sub-tests of the MacQuarrie. For economical 
use of the MacQuarrie in this situation it would appear that 
these three sub-tests properly weighted would provide results as 
good as the total test, plus a saving of time in administration and 
scoring. 

7. The MacQuarrie total test score and sub-test scores were found to 
correlate negatively with age. These negative correlations are 
all significant since they are greater than four times their probable 
errors. 

8. It was also found that the relationship between age and total 
MacQuarrie test score was curvilinear, and that the chi-square 
test showed this relationship to be significant at the 5 per cent 
level. It appears, then, that in this particular situation the older 
subjects tend to make lower scores on the MacQuarrie test. 

9. An inverse relationship was also found between age and the 
criterion of success. The r was found to be —.22 and the rela- 
tionship was definitely curvilinear, the chi-square test showing 
this curvilinear relationship to be highly significant at the 1 per 
cent level. 

10. The efficiency of the MacQuarrie test for selecting radio assembly 
operators as interpreted by Kelley’s coefficient of alienation would 
indicate the test to be 12 per cent more effective than the hiring 
procedure used by the company. 


Received December 31, 1945. 





Correlation Between Scores on Ortho-Rater Tests 
and Clinical Tests * 


C. Jane Davis 
Scientific Bureau, Bausch & Lomb Optical Company 


This investigation was conducted to determine the relationship be- 
tween standard clinical eye tests and the battery of visual skills tests 
given in the Bausch and Lomb Ortho-Rater. With the increasing use 
of the Ortho-Rater as a part of testing procedure in personnel and medical 
departments of industry there has been a growing need for tables permit- 
ting conversion of Ortho-Rater test scores to their clinical equivalents. 


Procedure 


The procedure followed consisted in giving the Ortho-Rater tests to a 
total of 95 subjects. On each of the individuals a battery of clinical test 
results were obtained as a matter of routine procedure in an industrial 
eye clinic.¹ There was no selection of subjects since all individuals 
reporting for refraction were included. The group consisted of 32 women 
between the ages of 16 and 57 and 63 men between the ages of 16 and 65. 
Of these, 46 came into the clinic without glasses and were tested with 
unaided vision only; 42 either wore or carried glasses and were tested first 
with unaided vision and then with their present prescription as worn; and 
7 wearing bifocals with considerable correction in the distance portion 
were unable to take the tests without their correction and were tested with 
glasses only. There were a total of 137 tests run on 95 individuals. In 
all cases both clinical and Ortho-Rater tests were made in each situation. 

Ortho-Rater Tests. Tests on the Ortho-Rater were given in the usual 
sequence with the addition of monocular tests with the unused eye 
occluded for right and left eyes at both testing distances. Order of testing 
was as follows: Distance Tests (Optical Distance of 26 Feet): 1. Vertical 
Phoria; 2. Lateral Phoria; 3. Acuity Both Eyes; 4. Acuity Right Eye 
(without occlusion); 5. Acuity Right Eye (monocular with occlusion); 
6. Acuity Left Eye (without occlusion); 7. Acuity Left Eye (monocular 
with occlusion); 8. Depth; and 9. Color. Near Tests (Optical Distance 
of 13 Inches): 1. Acuity Both Eyes; 2. Acuity Right Eye (without 
occlusion); 3. Acuity Right Eye (monocular with occlusion); 4. Acuity 
Left Eye (without occlusion); 5. Acuity Left Eye (monocular with 
occlusion); 6. Vertical Phoria; and 7. Lateral Phoria. In all acuity tests, 
subjects were started on the number one target. Following routine 
procedure,² standard questions were used, explanations and illustrations 
being supplied where necessary. Every attempt was made to see that 
the subject understood what was expected of him and that he made his 
best possible score in each test. 

* This article is a “prior publication,” the author paying complete costs. The 
scheduled 80 pages per issue is thereby increased by the corresponding amount, thus 
the “early publication” of this article is a direct contribution to the subscribers of the 
Journal of Applied Psychology without handicap to those authors whose articles are 
accepted and printed in their regular turn. 

¹ The experiment was carried on in the plant of the Bausch and Lomb Optical 
Company, Rochester, New York. The facilities of the industrial eye clinic of the plant 
were used in obtaining clinical results. 

Clinical Tests. Clinical tests were given to all the subjects in this 
experiment and they included: 1. Acuity Far: Both Eyes, Right Eye and 
Left Eye; 2. Vertical and Lateral Imbalance (Phoria) Far; 3. Acuity 
Near: Both Eyes, Right Eye and Left Eye; and 4. Vertical and Lateral 
Imbalance (Phoria), Near. 

Tests were given first without glasses; then if glasses were worn or 
carried the tests were repeated with prescription as worn. Distance 
acuity tests were made at 20 feet using a Clason acuity meter. The 
patient was seated in the examining chair without glasses or wearing his 
normal prescription and acuity targets were presented following standard 
clinical procedure. For this series of tests the clinic was asked to find 
threshold acuity in all cases. 

Clinical scores on the Clason have been recorded throughout this 
paper as ten times the reciprocal of the visual angles. This is identical 
with Ortho-Rater scoring and results in a value equal to ten times the 
Snellen decimal. 
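
The scoring convention can be illustrated with a short conversion from the familiar Snellen fraction. The helper name below is hypothetical, and the sketch assumes only the definition just given: the score is ten times the Snellen decimal. A Snellen notation of 20/15 works out to 13.3, which corresponds to the score of 13 used for that line.

```python
def acuity_score(test_distance, letter_distance):
    """Ten times the Snellen decimal, e.g. 20/20 -> 10."""
    return 10 * test_distance / letter_distance


print(acuity_score(20, 20))            # 10.0
print(round(acuity_score(20, 15), 1))  # 13.3
print(acuity_score(20, 40))            # 5.0
```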

Determination of Lateral Phoria, Far, employed one of the horizontal 
lines of letters on the Clason at 20 feet. Two of the testers used a hand 
Risley prism and the third used a Steven’s Phorometer for this measure- 
ment. Doubling of the target line is produced by use of vertical prism 
and the correction in lateral prism required to align the two lines is the 
phoria measurement. Vertical phoria at a distance of 20 feet was 
measured in a similar manner using a vertical line of letters on the Clason. 

Near point acuity tests were made using a reduced Snellen card which 
consists of seven rows of letters reduced for Snellen notations at 20 inches. 
The subject was allowed to hold the test card. This practice was 
satisfactory in producing a normal reading posture but resulted in a somewhat 
varying reading distance since the distance from eye to card was not 
measured for each individual. Since there was no measurement between 
a score of 10 (20/20) and a score of 13 (20/15), the scores tended to pile 
up at 10. 

² Standard practice in the administration of the Bausch and Lomb occupational vision 
tests with the Ortho-Rater. Bausch and Lomb Optical Company, Rochester, New York, 
1944. 

Lateral and vertical phoria at near were measured in the same manner 
as at distance, using reading card lines as targets. 


Results 


The statistical procedure adopted for quantifying the relationship 
between the Ortho-Rater and clinical scores was the determination of 
the Pearson product-moment coefficient of correlation between each of the 
pairs of tests studied. 

Scores on clinical vertical phoria tests approximated a point distri- 
bution at orthophoria. Correlations were not run between these tests 
and the Ortho-Rater tests since prediction cannot be made under these 
circumstances. 

It will be noted in Table 1 that the distance clinical tests correlate 
with the distance Ortho-Rater tests at a higher level than do the clinical 
near tests and the Ortho-Rater near tests. This is probably due to the 
grossness of the clinical near test for acuity with a pile-up of scores at 
the score of 10. Right and left eye tests on the Ortho-Rater showed a 
higher correlation with clinical tests when the occluder was used over the 
eye not being tested than when the test was given without occlusion. 
In clinical measurement of right and left eye acuity, the tests are 
necessarily administered monocularly. For this reason the higher correlation 
demonstrated between clinical scores and monocular scores on the 
Ortho-Rater is to be expected. In general the distance acuity tests 
demonstrated correlations of about .80 while near tests gave correlations of 
about .70.³ 


Table 1 
Obtained Coefficients of Correlation 

I. Distance Tests 

                     r (Without occlusion   r (Monocular test 
Test                 on Ortho-Rater)        on Ortho-Rater) 
Acuity Both Eyes          .82 
Acuity Right Eye          .67                    .76 
Acuity Left Eye           .63                    .82 
Lateral Phoria            .53 

II. Near Tests 

                     r (Without occlusion   r (Monocular test 
Test                 on Ortho-Rater)        on Ortho-Rater) 
Acuity Both Eyes          .71 
Acuity Right Eye          .54                    .64 
Acuity Left Eye           .67                    .70 
Lateral Phoria            .64 


Table 2 
Regression Equations for Predicting Clinical Acuity from Ortho-Rater Acuity Scores 

I. Distance Tests 

                                                                               S.E. of 
Test                                                   Equation                Estimate 
Acuity, Both Eyes                                      Cl. = .95 O.-R. + .2      1.7 
Acuity, Right Eye (without occlusion on Ortho-Rater)   Cl. = .70 O.-R. + 2.4     2.4 
Acuity, Right Eye (monocular with occlusion)           Cl. = .98 O.-R. - .2      2.1 
Acuity, Left Eye (without occlusion on Ortho-Rater)    Cl. = .58 O.-R. + 3.5     2.4 
Acuity, Left Eye (monocular with occlusion)            Cl. = .90 O.-R. + .6      1.8 

II. Near Tests 

                                                                               S.E. of 
Test                                                   Equation                Estimate 
Acuity, Both Eyes                                      Cl. = .85 O.-R. + 1.6     1.8 
Acuity, Right Eye (without occlusion on Ortho-Rater)   Cl. = .52 O.-R. + 5.0     2.2 
Acuity, Right Eye (monocular with occlusion)           Cl. = .80 O.-R. + 2.3     2.0 
Acuity, Left Eye (without occlusion on Ortho-Rater)    Cl. = .55 O.-R. + 4.5     2.1 
Acuity, Left Eye (monocular with occlusion)            Cl. = .77 O.-R. + 2.1     2.1 





From the above correlations and other derived statistics the regression 
equations in Table 2 for prediction of clinical acuity values from Ortho- 
Rater scores have been obtained. 
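
To show how a regression equation of this kind is turned into a predicted clinical score, a short sketch follows. It is an illustration only: the slopes and intercepts are the distance-test values read from Table 2, the dictionary keys and function name are hypothetical, and rounding halves upward is an assumption about how the predicted values were rounded to one decimal.

```python
from decimal import Decimal, ROUND_HALF_UP

# (slope, intercept) for Cl. = slope * O.-R. + intercept, distance tests.
DISTANCE_EQUATIONS = {
    "both_eyes": ("0.95", "0.2"),
    "right_eye_unoccluded": ("0.70", "2.4"),
    "right_eye_occluded": ("0.98", "-0.2"),
    "left_eye_unoccluded": ("0.58", "3.5"),
    "left_eye_occluded": ("0.90", "0.6"),
}


def predict_clinical(test, ortho_rater_score):
    """Predicted clinical acuity score, rounded to one decimal place."""
    slope, intercept = DISTANCE_EQUATIONS[test]
    value = Decimal(slope) * ortho_rater_score + Decimal(intercept)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


for score in (1, 10, 15):
    row = [str(predict_clinical(test, score)) for test in DISTANCE_EQUATIONS]
    print(score, row)
```

Decimal arithmetic is used here so that values such as 1.15 round to 1.2 rather than being pulled down by binary floating-point representation.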

Using the regression equations in Table 2, predictive tables have been 
set up for converting Ortho-Rater scores to their clinical equivalents 
(Tables 3 and 4). Near and distance tests give somewhat different 
values and separate predictive tables are included for each set of tests. 
In all cases clinical and Ortho-Rater score values are expressed as ten 
times the reciprocal of the visual angle. 

³ Test-retest reliabilities run previously on the Ortho-Rater gave coefficients of 
reliability of between .80 and .90. No reliability values are available for the clinical 
routine used in the present study; however, the lack of control of the distance on the 
near test would suggest a low reliability. 

Tables 3 and 4 give values resulting from the interpretation of the 
regression equations and indicate the mean expected score. An individual 
attaining a given Ortho-Rater score has a 50% chance of making the 
predicted clinical score shown in Table 3 or 4, or better. For practical use 
of the information in referring employees for professional attention a 
second set of tables (Tables 5 and 6) has been prepared with the predicted 
value set one S.E. of estimate above the value obtained using the 
regression equation. In this way, an individual attaining a given Ortho-Rater 
score has 84 chances in 100 of not exceeding the predicted clinical score. 
This modified predicted score probably should be used for predicting 
clinical acuity in cases of employees referred to professional eye men for 
professional consultation. By using this conservative prediction, there 
is a marked reduction of the possibility of sending an employee for 
professional treatment who will achieve higher than the predicted score on 
the clinical test when given by the professional man. 


Table 3 
Table for Predicting Clinical Distance Acuity Scores from Ortho-Rater 
Distance Acuity Scores 

                    Acuity,        Acuity,       Acuity,        Acuity, 
                    Right Eye      Right Eye     Left Eye       Left Eye 
Ortho-   Acuity,    (without       (monocular    (without       (monocular 
Rater    Both       occlusion on   with          occlusion on   with 
Score    Eyes       Ortho-Rater)   occlusion)    Ortho-Rater)   occlusion) 
  1       1.2          3.1            .8            4.1            1.5 
  2       2.1          3.8           1.8            4.7            2.4 
  3       3.1          4.5           2.7            5.2            3.3 
  4       4.0          5.2           3.7            5.8            4.2 
  5       5.0          5.9           4.7            6.4            5.1 
  6       5.9          6.6           5.7            7.0            6.0 
  7       6.9          7.3           6.7            7.6            6.9 
  8       7.8          8.0           7.6            8.1            7.8 
  9       8.8          8.7           8.6            8.7            8.7 
 10       9.7          9.4           9.6            9.3            9.6 
 11      10.7         10.1          10.6            9.9           10.5 
 12      11.6         10.8          11.6           10.5           11.4 
 13      12.6         11.5          12.5           11.0           12.3 
 14      13.5         12.2          13.5           11.6           13.2 
 15      14.5         12.9          14.5           12.2           14.1 
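
The "84 chances in 100" figure follows from treating errors of estimate as normally distributed, which is an assumption made here for illustration: the share of a normal distribution lying below one standard deviation above the mean is about .84. The sketch below, with hypothetical function names, shows that figure and the modified prediction for one case, using the distance both-eyes equation and its S.E. of estimate as read from Table 2.

```python
from statistics import NormalDist


def modified_prediction(slope, intercept, se_estimate, ortho_rater_score):
    """Regression prediction raised by one standard error of estimate."""
    return slope * ortho_rater_score + intercept + se_estimate


# Chance of not exceeding a bound set one S.E. above the regression line.
chance_below = NormalDist().cdf(1.0)
print(round(chance_below, 2))  # 0.84

# Distance acuity, both eyes: Cl. = .95 O.-R. + .2, S.E. of estimate 1.7.
modified = modified_prediction(0.95, 0.2, 1.7, 10)
print(round(modified, 1))      # 11.4
```

For an Ortho-Rater score of 10 the regression value is 9.7, so the modified prediction is 11.4.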

A similar treatment of lateral phoria scores is not included because 
(a) the correlations between clinical phoria scores and Ortho-Rater phoria 
scores given in Table 1 are too low to justify an attempt at accurate 
prediction and (b) the low correlations ordinarily found between any two 
phoria tests would seem to preclude the use of such tables for industrial 
prediction. It should be mentioned, however, that in spite of the low 
correlations found, when the mean Ortho-Rater score of individuals who 
demonstrated clinical lateral orthophoria was used as the dividing point 
on the Ortho-Rater scores, 80% of the subjects gave phoria measurements 
in the same direction on the clinical and Ortho-Rater lateral phoria tests. 
This finding would seem to indicate that in a large majority of cases, the 
Ortho-Rater lateral phoria score does indicate at least the direction of a 
clinical lateral phoria test finding. 


Table 4 
Table for Predicting Clinical Near Acuity Scores from Ortho-Rater Test 
Near Acuity Scores 

                    Acuity,        Acuity,       Acuity,        Acuity, 
                    Right Eye      Right Eye     Left Eye       Left Eye 
Ortho-   Acuity,    (without       (monocular    (without       (monocular 
Rater    Both       occlusion on   with          occlusion on   with 
Score    Eyes       Ortho-Rater)   occlusion)    Ortho-Rater)   occlusion) 
  1       2.5          5.5           3.1            5.1            2.9 
  2       3.3          6.0           3.9            5.6            3.6 
  3       4.2          6.6           4.7            6.2            4.4 
  4       5.0          7.1           5.5            6.7            5.2 
  5       5.9          7.6           6.3            7.3            6.0 
  6       6.7          8.1           7.1            7.8            6.7 
  7       7.6          8.6           7.9            8.4            7.5 
  8       8.4          9.2           8.7            8.9            8.3 
  9       9.3          9.7           9.5            9.5            9.0 
 10      10.1         10.2          10.3           10.0            9.8 
 11      11.0         10.7          11.1           10.6           10.6 
 12      11.8         11.2          11.9           11.1           11.3 
 13      12.7         11.8          12.7           11.7           12.1 
 14      13.5         12.3          13.5           12.2           12.9 
 15      14.4         12.8          14.3           12.8           13.7 


Table 5 
Table for Modified Prediction of Clinical Distance Acuity Scores from 
Ortho-Rater Distance Acuity Scores 

                    Acuity,        Acuity,       Acuity,        Acuity, 
                    Right Eye      Right Eye     Left Eye       Left Eye 
Ortho-   Acuity,    (without       (monocular    (without       (monocular 
Rater    Both       occlusion on   with          occlusion on   with 
Score    Eyes       Ortho-Rater)   occlusion)    Ortho-Rater)   occlusion) 
  1       2.9          5.5           2.9            6.5            3.3 
  2       3.8          6.2           3.9            7.1            4.2 
  3       4.8          6.9           4.8            7.6            5.1 
  4       5.7          7.6           5.8            8.2            6.0 
  5       6.7          8.3           6.8            8.8            6.9 
  6       7.6          9.0           7.8            9.4            7.8 
  7       8.6          9.7           8.8           10.0            8.7 
  8       9.5         10.4           9.7           10.5            9.6 
  9      10.5         11.1          10.7           11.1           10.5 
 10      11.4         11.8          11.7           11.7           11.4 
 11      12.4         12.5          12.7           12.3           12.3 
 12      13.3         13.2          13.7           12.9           13.2 
 13      14.3         13.9          14.6           13.4           14.1 
 14      15.2         14.6          15.6           14.0           15.0 
 15      16.2         15.3          16.6           14.6           15.9 


Table 6 
Table for Modified Prediction of Clinical Near Acuity from Ortho-Rater 
Near Acuity Scores 

                    Acuity,        Acuity,       Acuity,        Acuity, 
                    Right Eye      Right Eye     Left Eye       Left Eye 
Ortho-   Acuity,    (without       (monocular    (without       (monocular 
Rater    Both       occlusion on   with          occlusion on   with 
Score    Eyes       Ortho-Rater)   occlusion)    Ortho-Rater)   occlusion) 
  1       4.3          7.7           5.1            7.2            5.0 
  2       5.1          8.2           5.9            7.7            5.7 
  3       6.0          8.8           6.7            8.3            6.5 
  4       6.8          9.3           7.5            8.8            7.3 
  5       7.7          9.8           8.3            9.4            8.1 
  6       8.5         10.3           9.1            9.9            8.8 
  7       9.4         10.8           9.9           10.5            9.6 
  8      10.2         11.4          10.7           11.0           10.4 
  9      11.1         11.9          11.5           11.6           11.1 
 10      11.9         12.4          12.3           12.1           11.9 
 11      12.8         12.9          13.1           12.7           12.7 
 12      13.6         13.4          13.9           13.2           13.4 
 13      14.5         14.0          14.7           13.8           14.2 
 14      15.3         14.5          15.5           14.3           15.0 
 15      16.2         15.0          16.3           14.9           15.8 

Vertical phoria clinical tests in general do not divide a group as finely 
as the Ortho-Rater test. The scoring on the Ortho-Rater test has proved 
to be of value in industrial relations for selecting individuals suited 
visually for certain occupations. A more gross test appears to be satis- 
factory for general clinical purposes. Predictive data are not shown here 
for the same reasons as in the case of lateral phoria. 


Summary and Conclusions 


A testing program was conducted in an industrial eye clinic in an 
effort to find the relationship between scores on Ortho-Rater tests and 
clinical tests. There were a total of 137 tests (both clinical and 
Ortho-Rater) run on 95 individuals. Pearson product-moment coefficients of 
correlation ranged from about .60 to .90. Although these correlations 
are rather low for individual prediction, a series of predictive tables suit- 
able for the needs of industry has been evolved. 

The data justify the following conclusions. 


1. Prediction from Ortho-Rater to clinical scores as measured in the 
present study can be made within reasonable tolerances for all acuity 
tests. 

2. The Ortho-Rater lateral phoria tests indicate the direction of the 
phorias revealed by clinical tests in 80% of the cases, although the cor- 
relations are too low to permit prediction of the amount of the lateral 
phoria. 

3. A measurement of the relation between clinical vertical phoria 
tests and Ortho-Rater vertical phoria tests could not be made because the 
clinical vertical phoria test scores approximated a point distribution. 


Received July 8, 1946. 











Occupational Differences in the Minnesota Multiphasic 
Personality Inventory * 


Willie Maude Verniaud 
Department of Tests and Measurements, Houston Public Schools 


The Minnesota Multiphasic Personality Inventory (MMPI) was 
given to 97 women in three contrasting occupations, as an aid in deter- 
mining whether or not there are occupational differences on this Inven- 
tory. It is the purpose of this paper to present findings of the investiga- 
tion. 

The Test 


The test consists of 550 statements printed on cards, filed by the 
subject under guide-cards marked “True,” “False” and “Cannot Say.” 
There are three validating indicators, (?), (L), (F), and 9 diagnostic 
scales: Hypochondriasis (Hs), Depression (D), Hysteria (Hy), Psycho- 
pathic Deviate (Pd), Masculinity-Femininity (Mf), Paranoia (Pa), 
Psychasthenia (Pt), Schizophrenia (Sc), and Hypomania (Ma). The 
Inventory was developed by a clinical psychologist (Dr. S. R. Hathaway) 
and a neuropsychiatrist (Dr. J. C. McKinley) as an aid in identifying 
individuals in need of psychiatric attention. A large proportion of scores 
made by employed people on such an Inventory would be expected to 
lie within the ‘normal’ range, since individuals who work for a living are, 
by definition, ‘normal’ enough to maintain themselves in a paid job. 
Nevertheless, the selection of a clinical instrument for the investigation 
was deliberate. Under current psychological theory, the so-called 
functional disorders may represent extreme forms of personality tend- 
encies present in all of us to varying degrees, and become abnormal only 
when “out of bounds.” If this be true, then an instrument sensitive 
enough to be of value in identifying extreme deviates may be of value 
in identifying personality differences among functionally normal indi- 
viduals in contrasting occupations. 


The Subjects 


The subjects included 40 clerical workers, 27 department store sales- 
women, and 30 optical workers from an industrial plant engaged in 
making lenses and prisms for naval binoculars. 

* Material based on Master’s Thesis on file in library of the University of Minnesota, 
July 1945, prepared under the direction of Professors S. R. Hathaway and D. G. 
Paterson. 

Of the 40 clerical workers, 16 were employed in administrative 
departments of the City of Houston, 16 in administrative offices of the 
Houston Independent School District, 8 under various Federal, County 
or District officials. All worked directly under someone with executive 
or professional status, had duties including both paper work and personal 
contacts, and had had two years of experience as a minimum. None of 
the group were workers whose primary function was to direct others 
rather than serve a chief directly. All were completely free from re- 
sponsibility for original decision. 

The 27 saleswomen were drawn from three Houston department 
stores by asking an executive in each store to select ten individuals 
whom he considered good saleswomen. Three of the 30 so selected did 
not take the test. Those who did, represent 20 selling departments, as 
follows: Cosmetics 4, Housewares 2, Boys’ Clothing 2, Infants’ Wear 2, 
and 1 each from Lingerie, Men’s Furnishings, Basement Ready to Wear, 
Furs, Corsets, Handbags, Better Dresses, Housedresses, Millinery, 
Paints, Blouses, Automobile Accessories, Toys, Shoes, China-Glassware. 

The 30 optical workers were employed in one of the following pro- 
duction departments: blocking, roughing, emery grinding, polishing, 
finishing. All whose profiles were used had a minimum of four months’ 
experience in one or more of the above processing operations. All had 
had some experience in an operation other than the one to which the 
operator was first assigned. The reason for this last requirement lay in 
the observed fact that Management provided no trained flying squad or 
other formal organization to cover emergencies and break bottlenecks, 
assuming the regular operator group to be sufficiently interchangeable 
that smooth flow of work could be maintained by reshuffling, also that 
an experienced operator could change methods as often as technicians 
modified machinery, materials or techniques. Both assumptions were 
true of the group as a whole. 

Tables 1 and 2 show the distribution of the three occupational groups 
according to age and education. 











Table 1 
Ages of Occupational Groups 

            Clerical   Sales-   Optical 
Age         Workers    women    Workers   Total 
Under 20       4          1        7        12 
20-29         14          1        8        23 
30-39         12         11       10        33 
40 up         10         13        5        28 
n.d.           -          1        -         1 
Total         40         27       30        97 


Table 2 
Education of Occupational Groups 

                 Clerical   Sales-   Optical 
Education        Workers    women    Workers   Total 
6th grade           0          1        0         1 
7th grade           0          0        2         2 
H.S. Undergr.       0          4        9        13 
H.S. Grad.         23         12       15        50 
Coll. Undergr.     11          6        2        19 
Coll. Grad.         6          0        2         8 
n.d.                -          4        -         4 
Total              40         27       30        97 


Results 


Figure 1 shows graphically the mean T-Score profiles of the clerical 
workers, saleswomen and optical workers. The heavy horizontal line 
represents mean score of the normative group on which T-Scale was 
based. 
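
For readers unfamiliar with the T-Scale, the conversion can be sketched as follows. This is a minimal illustration assuming the usual linear T-score, in which the normative mean is set at 50 and each normative standard deviation counts as 10 points; the raw-score figures in the example are hypothetical and are not taken from the Inventory's norms.

```python
def t_score(raw, norm_mean, norm_sd):
    """Linear T-score: 50 at the norm mean, 10 points per norm S.D."""
    return 50 + 10 * (raw - norm_mean) / norm_sd


# Hypothetical norm values, for illustration only.
print(t_score(20, 20, 5))    # 50.0
print(t_score(25, 20, 5))    # 60.0
print(t_score(17.5, 20, 5))  # 45.0
```

On this scale a group mean of 58 lies eight-tenths of a normative standard deviation above the norm mean.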


Multiphasic Scales 
            ?    L    F    Hs   D    Hy   Pd   Mf   Pa   Pt   Sc   Ma 
Clerical    50   50   50   46   48   49   49   54   49   49   50   52 
Sales       51   50   50   47   47   50   50   58   45   47   47   52 
Optical     50   50   53   46   48   49   56   55   56   55   54   58 


Fig. 1. Mean T-Score profiles on MMPI for the clerical, sales and 
optical production groups. 

The occupational profiles are markedly similar on the first three 
characteristics of the Scale, the “psychoneurotic triad”: Hypochondriasis 
(Hs), Depression (D), Hysteria (Hy). All fall below or at the norm 
mean-line. All three occupational means in Masculinity (Mf) and in 
Hypomania (Ma) fall at some point above the norm mean-line. The 
clerical worker profile, however, remains reasonably flat throughout and 
rather closely approximates the norm mean, whereas the composite profile 
for saleswomen shows a sharp elevation at Masculinity (Mf). As soon 
as the optical worker profile leaves the “psychoneurotic triad” (Hs, D, 
Hy), it mounts to a plateau relative to the mean of the norm group, with 
Hypomania (Ma) slightly elevated relative to this plateau. 

While the profiles show clear group characteristics, they do not yield 
information as to the probability that the observed differences between 
occupational and norm means arose solely from errors of random sampl- 
ing. For this, the reader is directed to Table 3, which shows raw score 
means, with per cent of occupational group reaching or exceeding norm 
mean in each characteristic (overlap), standard deviation from mean 
(S.D.), and ratio of difference to standard error of difference (C.R.). 
In the case of the following mean differences between occupational 
groups and the norm group, the null hypothesis can be rejected: 


Clerical Workers: Significantly lower mean score in Hypochondriasis; 

Saleswomen: Decided differentiation in Masculinity responses; 

Optical Workers: Definite differentiation in the direction of Hypomania 
and Psychasthenia, with statistically significant mean scores in 
Paranoia and Psychopathic Deviate. 
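
The critical ratio (C.R.) referred to above, the ratio of a difference between two means to the standard error of that difference, can be sketched as follows. This is a minimal illustration assuming independent groups and the usual large-sample standard error; the function name and the numbers are hypothetical and are not taken from Table 3.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two group means divided by the
    standard error of that difference (independent groups)."""
    se_a = sd_a / math.sqrt(n_a)   # standard error of mean A
    se_b = sd_b / math.sqrt(n_b)   # standard error of mean B
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / se_diff


# Hypothetical raw-score means and S.D.s, for illustration only
cr = critical_ratio(10.0, 2.0, 100, 9.0, 2.0, 100)
print(round(cr, 2))
```

A ratio well above 2 or 3 in absolute value is the kind of result that would lead one to reject the null hypothesis; the exact cutoff the author applied is not stated in the text.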


This would mean, for clerical workers, that one would logically ex- 
pect them as a group to show less tendency toward abnormal concern for 
bodily functions, less evidence of worry over their health than an un- 
selected sampling of women might show. However, 28% of the group 
reached or exceeded the norm mean in hypochondriasis, too large a pro- 
portion for a low score to have much occupational significance, taken 
alone. Since no other characteristic is clearly significant, we would 
expect to find either a tendency to have fewer marked deviations in the 
direction of abnormality than an unselected group, or so conflicting a set 
of individual job requirements that the result cancels out in combining 
the scores in one group profile. In other words, the clerical worker 
group would appear to be essentially an undifferentiated sampling of the 
“normal” population. 

Three department store saleswomen in 27 (11%) reach or exceed the
norm mean in the direction of “femininity.” To rephrase, since the
T-Scores were reversed for this Scale when applied to females, 24 women
out of the 27 reach or exceed a T-Score of 50 in the direction of
“masculinity.” Two of the remainder are at T-Score 49. A solitary score falls
definitely at the “feminine” end of the Scale (raw score 44, T-Score 34).
This score was investigated, found to represent a pleasant-mannered, 
middle-aged housewife, one year of college, no work experience outside 
the home until she entered the present department two years ago, con- 
sidered an outstanding saleswoman in that location, which is housewares. 
Despite the fact that these are all department store saleswomen, that is, 
women who wait on customers who come to them, this is the only defi- 
nitely “feminine” score found. Would a group of insurance saleswomen, 
say, or other saleswomen who must seek out their customers, tend to be 
highly selected in “masculinity”? The answer might help us evaluate 
what a woman’s score in “masculinity” means on MMPI. Unfortun- 
ately, there has been little research into the behavioral meaning of a 
score in this characteristic, when made by a woman. Further, it is the
one Scale which was not related to a clinically diagnosed type (for
females). To hazard a guess as to interpretation, it might mean a
tendency to dominate and direct a situation rather than be dominated by it,
a tendency to aggressiveness rather than passivity, and since many of the
statements have to do with expressed interests and aversions, a tendency
to share “masculine” interests to a greater extent than might be expected
in an unselected sampling of women.
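
The reversal of T-Scores for women on this Scale can be made concrete with a short sketch. It assumes the ordinary linear T-Score (mean 50, standard deviation 10 in the norm group). The norm mean of 36 and S.D. of 5 used here are hypothetical values chosen only so that the example reproduces the raw score 44 and T-Score 34 quoted above; they are not the published norms.

```python
def t_score(raw, norm_mean, norm_sd, reverse=False):
    """Linear T-Score: 50 plus 10 times the standard score.
    With reverse=True the direction of the scale is flipped,
    as the text describes for the Mf scale applied to women."""
    z = (raw - norm_mean) / norm_sd
    if reverse:
        z = -z
    return 50 + 10 * z


# Hypothetical female norms for Mf (illustration only)
HYPOTHETICAL_MF_MEAN = 36.0
HYPOTHETICAL_MF_SD = 5.0

print(t_score(44, HYPOTHETICAL_MF_MEAN, HYPOTHETICAL_MF_SD, reverse=True))
```

Under these assumed norms a high raw score yields a low reversed T-Score, which is why the one high raw score lies at the “feminine” end of the profile.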

Since 26 of 30 optical workers sampled, or 87%, reached or exceeded 
the norm mean in hypomania, and 77% to 83% of the same individuals 
also reached or exceeded the norm mean in the other “significant”
characteristics: psychasthenia, paranoia and psychopathic deviate, we 
would be justified in suspecting that average or above-average scores in 
these characteristics may be related to something in the job, job environ- 
ment, or job relationships. In terms of the expected meanings of the 
characteristics, we would expect these workers as a group to be restless, 
“full of plans,” alternating between enthusiasm and over-productivity in 
energy output and moods of depression, more inclined toward anxieties
and compulsive behavior than the average individual, disinclined (or 
unable) to concentrate for long periods on one task, somewhat oversensi- 
tive or suspicious of the good-will of others, somewhat more inclined than 
the average woman to disregard social mores. 
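
The “overlap” percentages quoted in this section (26 of 30, or 87%; 3 of 27, or 11%) are simply the share of a group whose scores reach or exceed the norm mean. A minimal sketch follows; the function name and the individual scores are hypothetical, since only the counts are reported in the text.

```python
def percent_at_or_above(t_scores, norm_mean=50):
    """Per cent of a group reaching or exceeding the norm mean."""
    if not t_scores:
        raise ValueError("need at least one score")
    reaching = sum(1 for t in t_scores if t >= norm_mean)
    return 100.0 * reaching / len(t_scores)


# Hypothetical scores arranged to match the reported counts:
# 26 of 30 optical workers at or above the norm mean in hypomania
hypothetical_ma_scores = [58] * 26 + [46] * 4
print(round(percent_at_or_above(hypothetical_ma_scores)))  # 87
```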

It would not be conservative, however, to draw inferences as to
significance in an investigation of this kind on the basis of statistical
evidence alone. The next step, therefore, was to study individual data
on workers and their jobs. Because of unavoidable limitations, this
could not be done for individual saleswomen, and analysis of the other
occupational groups consisted of bringing together available material
rather than case study in the clinical meaning. Nevertheless, when the
material was assembled, the available clinical evidence was as clear as the
statistical in pointing to relationship between type of work, type of
worker and characteristic responses on MMPI. Space does not per- 
mit the presentation of the bulk of this evidence. The three cases used 
are illustrative rather than typical. Only five other individuals in the 
clerical worker group showed as little deviation of score as does the 
clerical worker used. Both optical workers stand out in their occupa- 
tional group. The last in particular was sufficiently deviate among her 
peers that around her small shop legends grew up. Here is also the 
most deviate profile. The case was selected for its suggestion of close 
association between rather specific job conditions and the needs of a 
particular personality. 


Illustrative Cases 


Clerical Worker No. 40: ?50, L56, F50; Hs54, D49, Hy54, Pd56, Mf47,
Pa50, Pt46, Sc39, Ma45


This woman is 38, high school and business school graduate, with some 
extension work in business administration and in personnel management. She 
is divorced, shares an apartment with two women friends. She has been 
working for twenty years, worked throughout the episode of her marriage. 
Her real love is the department where she began when she finished business 
school twenty years ago and in which she has worked up to the position of 
secretary to the director. The director inherited his secretary, who is secure 
under Civil Service in the job regardless of changing administrations. She 
appears to have all of the secretarial virtues: an intense, protective loyalty to 
the man she is serving; disregard of clock hours; discrimination as to what 
should go to the director and what should be rerouted through someone else; 
subordination of personalities while on the job; finesse in handling visitors, 
applicants for favors, complainants, taxpayers, division chiefs and others 
clamoring for immediate attention from the director. Added to this, she knows 
the machinery of the department as she knows the palm of her hand. 

The most outstanding thing about this worker’s attitude is the impression
one gets that she has “arrived.” There appear to be no unattained horizons,
no yearnings for something not yet found. There appears to be also a marked
lack of haziness about the various demarcations of life. Things and people
belong in the place assigned. Senior clerks, juniors, division chiefs, and so on
are accepted with no inquiry into what does not concern her, such as questions
of varying qualifications, competence, etc. So long as the person occupies the
assigned place, she is as loyal to each, respecting those above and below, as she
is to the man appointed as director of “her” department.


Optical Worker No. 5: ?50, L53, F50; Hs41, D42, Hy52, Pd67, Mf63,
Pa67, Pt60, Sc58, Ma68


This woman is 34, high school and business school graduate. She is
married, with a daughter of 12. She started out as a clerical worker, worked
up to the position of secretary to the editor of a local newspaper, advanced
from this to newspaper reporter. However, she was put to reporting women’s
affairs, had no chance to do general reporting as the men did, and after two
years left to work for a clothing manufacturer. She started out as his
bookkeeper, but also went out into the plant to do cutting. She disliked the
bookkeeping, “loved” the cutting, and for this reason thought she might like this
new occupation for women, when a local university offered vocational courses
in optics. She came in 17 months ago when “the whole thing was
experimental; so small that one girl did what a department does now.” She was
“fascinated” by it; and has remained so, has never done anything that gives
her such a “kick.”

She is now supervisor of the polishers. By the shift foreman’s report, she 
has transformed the polishing room from one of their major headaches to a 
“bangup” job. She is exacting to work for, unpredictable as a person. She 
likes to do “screwball” things. Once she emerged from the polishing room
with the red polishing compound smeared on hands and arms to the elbows, 
waved them in the investigator’s face, saying “Blood!” During the holidays, 
when you heard a jingle, you knew that the supervisor of polishers was going 
through. She had tied bells to her shoes. She is both liked and disliked 
enthusiastically, but not overlooked. “Finding out it takes a screwball to do
this?” she hailed the investigator from across a room. “We oldtimers could
have told you, you have to be a little crazy not to go crazy here.”


Optical Worker No. 15: ?50, L50, F70; Hs65, D46, Hy56, Pd73, Mf66, 
Pa65, Pt65, Sc67, Ma81 


A second profile was obtained on this 19-year-old girl, as she is the best
processor of optical glass in the roughing department, not excepting the
roughing supervisor, who admits it. On the second administration, she sat beside
the investigator. The cards were read to her, filed as she directed, with
essentially the same results as the first time. Despite the high F-score, this is
therefore believed to be a valid profile in the sense that it represents a filing of
the cards as this subject wished them filed.

She is a high school graduate, finished the college entrance sequence with 
a B plus average by her statement (probably correct, as the keenness of her 
observations and comments indicated good scholastic aptitude as definitely as 
her performance indicated motor aptitude and space-perception equipment). 
She is unmarried, the daughter of a glass-cutter and glazier, who had his own 
business and taught her his trade. She was able to handle any operation in 
roughing, “pick up” any new mechanism and any alteration in procedure, and
could teach it to others. The roughing processes, that is, the various opera- 
tions in the grinding of lenses and prisms, were the processes she limited herself 
to, but she would be “all over” that department, lending a hand because her 
quickness would leave her temporarily idle until stations feeding hers could 
catch up, and was also the first to be reshuffled when absenteeism or other 
emergency created bottlenecks, until she was fixed to one station by a
delicate, complicated new machine, the one machine that “free” operators may
not approach lest the operator be diverted. She mastered it. One picked 
operator at a time was then assigned her to teach, and she was consulted on 
whether the operator “has what it takes.” She liked the honor but disliked 
the “loneliness” of this station, and when her pleas to be released from this 
machine went unheeded, she did not remain long thereafter. 

“Vivid” and “colorful” are words that come to mind, recalling the impact 
of this young person on others. Everyone knew her, in and out of her depart- 
ment. Returning to town at midnight, other young people would wait for a 
bus that she would be on. She usually sang aloud. Yet occasionally, she 
would have to be roused to her expected role: “Why are you so quiet tonight?”
“Can’t sing all the time.” But presently, she did, the others joining. She
often led the whistling and wolf-calls that greeted embarrassed males who
boarded this bus loaded with women. 

On last accidental contact, on a city bus, this subject reported that she 
had physical and mental tests and was accepted as a cadet nurse. One 
wonders what happened after that. 


Summary and Conclusions 


Samplings of clerical workers, department store saleswomen and 
women optical workers in a newly opened local industry show character- 
istic differences in responses on the Minnesota Multiphasic Personality 
Inventory. These differences are strong enough to indicate that occupa- 
tional differences in personality, although slight, may be measurable and 
significant. 

The clerical workers approach most closely to a normal sampling, the
only clear differentiation being a group tendency toward lower scores in
hypochondriasis. The saleswomen are strongly differentiated from the
normal sampling in a tendency toward responses designated as
“masculine” on this Inventory. The sampling of industrial women deviates
from the normal sampling in several respects, a tendency toward
hypomania being particularly marked.

These findings must be interpreted in the light of the particular oc- 
cupational settings. They may not be valid for workers doing similar 
work, but under very different job conditions. One conclusion can be 
drawn from this investigation: There are group differences in the per- 
sonality of successful workers corresponding to gross differences in job 
requirements, and some of these differences may be identified by re- 
sponses on the Minnesota Multiphasic Personality Inventory. 


Received December 6, 1945.


The Effect of an Increasingly Well Defined Criterion on
the Prediction of Success at Naval Training 
School (Tactical Radar) * 


Dewey B. Stuit, Lt. Comdr., USNR, and 
John T. Wilson, Lt. Comdr., USNR 


Bureau of Naval Personnel and Headquarters, Commander in Chief, 
United States Fleet 


This study was undertaken to investigate the validity of the tech- 
niques used to select officers for tactical radar training, and to determine 
the suitability of the criteria of success employed at Naval Training 
School (Tactical Radar).¹

Research on the validity of selection requirements for Naval Training 
School (Tactical Radar) was preceded by a National Defense Research 
Committee exploratory study of techniques employed in the selection 
of officer personnel for a similar school at St. Simons Island, Georgia. 
In this preliminary study it was found that the best students were those 
who made high scores in the Officer Qualification Test, Relative Move- 
ment Test, Polar-Grid Coordinate Test, and low scores in the Personal 
Inventory (Enlisted Form). In addition, it was found that in general 
the best students were those who had had either administrative or man- 
agerial experience and who were judged by trained interviewers to be 
quick and accurate thinkers. On the basis of these results, selection re- 
quirements were tentatively established for Naval Training School 


* The opinions expressed in this article are those of the authors and are not to be 
construed as reflecting the policies or opinions of the Navy Department. 

1 Training in the tactical application of radar may be contrasted to training in the 
technical phases of radar, in that the NTSch (Tactical Radar) graduate is assigned to a 
billet aboard ship in Combat Information Center (CIC). The functions of CIC are 
operational or tactical in nature; they are specifically, to keep the Commanding Officer 
and other higher echelons of command aboard informed of the location, identity, and 
movement of friendly and/or enemy forces within the area, to control aircraft and small 
craft in the area, to aid in navigation and to indicate targets. The CIC is manned by 
a “team” composed of a “CIC Officer” who is in charge, varying numbers of “CIC 
Watch Officers” qualified to control aircraft and to supervise plotting facilities, and a 
larger number of enlisted personnel who act as radar operators, telephone talkers, 
plotters, and communications yeomen. Technical radar officers on the other hand are 
assigned to the relatively more individualized duty of maintenance of electronic equip- 
ment, particularly radar. Tactical radar training consists of eight weeks basic training 
in the tactical employment of radar and in the other related functions of CIC. 




(Tactical Radar), the main features of which are: (a) a minimum Navy
Standard Score of 50 in the Officer Qualification Test; (b) high scores in
the Tactical Radar Aptitude Test² and Relative Movement Test;³
(c) evidence of piloting ability as indicated by preliminary estimate of
work in previous navigation courses; (d) maturity of behavior and
judgment with a preferred age range from 22-33; (e) educational and
occupational backgrounds indicating a high degree of verbal ability; (f) informed
volunteer status; (g) physically qualified for sea duty.


Procedure 


Selection Tests Used. The selection tests used in this study were as 
follows: (a) Officer Qualification Test, (b) Officer Classification Test, (c) 
Tactical Radar Aptitude Test, (d) Relative Movement Test, and (e) 
CIC Aptitude Test, Form 2. 

The Officer Qualification Test was originally designed for use by Of- 
fices of Naval Officer Procurement in determining the qualifications of 
candidates seeking commissions. The test consists of 100 items; 50 
Verbal, 20 Mathematical and 30 Mechanical Comprehension. The 
score is the number right in the total test. 

The Tactical Radar Aptitude Test consists of three parts: Polar- 
Grid Coordinate, Ratio Estimation and Coordinate Reading. The first 
part measures the examinee’s ability to translate the reading of a point 
on a polar coordinate to a grid coordinate chart. Part two is a test of the 
ability to estimate relative lengths of lines presented in pairs. Part three 
measures ability to estimate the direction and range of targets on a 
polar coordinate chart. There are 50 items in part one, 90 in part two 
and 90 in part three. Standard scores are computed for the separate 
tests as well as the total test. 

The Relative Movement Test consists of 30 items designed to measure 
the ability to visualize the relative movement of ships, involving the 
determination of direction, distance, or speed of ships. Basically the 
test appears to be a measure of spatial relations ability, but presents 
problems in a “navigational” setting.

The CIC Aptitude Test, Form 2, consists of three parts: Polar-Grid
Coordinate, 45 items; Scale Reading, 60 items; and Relative Movement,
45 items. The Polar-Grid Coordinate and Relative Movement Tests
are revisions of the tests of the same name described above. The Scale
Reading Test, originally developed by NDRC Project NS-146, measures
the ability to read scales of various kinds with speed and accuracy.
Scores are computed for the three parts as well as for the test as a whole.


2 The Polar-Grid Coordinate Test, Ratio Estimation Test and Coordinate Reading
Test comprising the Tactical Radar Aptitude Test were originally developed by NDRC
Project NS-146.

3 The Relative Movement Test was originally developed by the University of
California Division of War Research, Section 6.1, NDRC.

The Officer Classification Test was designed for use at reserve mid- 
shipmen and indoctrination schools in classifying officers and officer 
candidates for advanced school training or billet assignments. The four 
parts of the test are as follows: Verbal, 60 items; Mechanical, 90 items; 
Mathematical, 45 items; and Spatial, 60 items. Scores are reported only 
for the parts of the test. 

Administration of Selection Tests. The Officer Qualification Test, 
Tactical Radar Aptitude Test, and Relative Movement Test were ad- 
ministered to all students enrolled in the first two classes at the school. 
Students reporting from indoctrination or reserve midshipmen schools 
were tested before their arrival at NTSch (Tactical Radar) and were 
selected for training upon the basis of their performance in the tests and 
an appraisal of their personal qualities by interviewing officers. Of- 
ficers reporting from other shore establishments and fleet commands were 
tested after reporting to NTSch (Tactical Radar). 

The Officer Classification Test was first administered to the third 
class. During the summer of 1944 the routine administration of this test 
at indoctrination and reserve midshipmen schools was instituted. Re- 
ports of scores made by officers subsequently recommended for tactical 
radar training were sent to the Bureau of Naval Personnel. 

On the basis of preliminary results with the Tactical Radar Test 
Battery, a revised selection test, the CIC Aptitude Test, Form 2, was 
constructed. This test was administered at NTSch (Tactical Radar) to 
approximately one-half of the officers enrolled in the seventh class and 
to all members of the eighth and ninth classes. The members of the 
seventh class took the test after they had completed their tactical radar 
training. Class 8 had completed half of its course of training, and class 
9 was in the first week of training when the test was administered. 

Criterion Measures Used. The final course grade for the first three 
graduating classes consisted of a weighted average of the “theory” and 
“practical” grades assigned each student. The “theory” grade con- 
sisted of the arithmetical average of all grades received in weekly quizzes. 
The “practical” grade was based upon ratings of the student’s perform- 
ance during simulated battle problems, in the school’s CICs (Combat 
Information Centers). The ratings, using a scale of 1 (high) to 5 (low), 
were made on the following traits: leadership, team work, judgment, 
mental agility, surface target plotting, air target plotting and speech. 
The “practical” rating, expressed in terms of the Navy grading system
(from zero to 4.0 with 2.5 equaling a minimum passing grade), comprised
two-thirds of the final grade, which was also expressed in the Navy
grading system.

In the case of classes 5 and 6, the final grade consisted of the
arithmetical average of the “theory” grade, the “practical” grade, and the added
factor of a comprehensive final achievement examination grade. The “theory”
and “practical” grades were computed in the same manner as for classes
1, 2 and 3. The final achievement examination grade consisted of the 
student’s score in the CIC Final Achievement Examination, Experi- 
mental Form 1. This test consisted of 240 multiple choice items and 
was administered experimentally to these two classes (5 and 6). 

The final grade for classes 7, 8 and 9, consisted of the arithmetical 
average of the “theory” grade based upon the first month’s course grades,
a “practical” grade based upon grades made during the second month 
of the course and performance in a comprehensive “practical examina- 
tion” and the grade in the CIC Final Achievement Examination, Form 
2. The latter test consisted of 240 items and is comparable to the CIC 
Final Achievement Examination, Form 1, administered to classes 5 and 
6. 
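
The two grading schemes described above can be sketched as follows. This is only an illustration under stated assumptions: the text says the “practical” rating comprised two-thirds of the final grade for the first three classes, so the one-third weight given to “theory” here is inferred as the remainder; and the achievement examination score is assumed to have been converted to the Navy 0 to 4.0 scale already, a conversion the text does not describe. Function names are hypothetical.

```python
PASSING_GRADE = 2.5  # minimum passing grade on the Navy 0-4.0 scale


def final_grade_early_classes(theory, practical):
    """Classes 1-3: weighted average, practical counted two-thirds
    (theory weight of one-third is inferred, not stated)."""
    return (theory + 2.0 * practical) / 3.0


def final_grade_later_classes(theory, practical, achievement):
    """Classes 5-9: arithmetical average of three components,
    all assumed to be on the Navy 0-4.0 scale."""
    return (theory + practical + achievement) / 3.0


# Hypothetical grades for one student
early = final_grade_early_classes(3.0, 3.6)
later = final_grade_later_classes(3.0, 3.3, 3.6)
print(round(early, 2), round(later, 2), later >= PASSING_GRADE)
```

The change matters for the analysis that follows: in the later scheme each component carries equal weight, so no single rating dominates the criterion.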

Method of Analyzing Data. The predictive value of each of the selec- 
tion tests was determined by correlating test scores with the various 
criteria of success, using the Pearson product-moment coefficient of cor- 
relation as the index of relationship. 
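
For readers who want the index spelled out, the Pearson product-moment coefficient can be computed as in the sketch below. The function name and the paired scores are hypothetical; the study's raw data are not reproduced in the article.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length
    sequences of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical selection-test scores and final grades for five officers
test_scores = [48, 52, 55, 60, 63]
final_grades = [2.6, 3.0, 2.9, 3.4, 3.5]
print(round(pearson_r(test_scores, final_grades), 2))
```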


Results 


Correlations of Aptitude Tests with Criteria of Success 


Tactical Radar Aptitude Test Battery. The coefficients of correlation 
showing the relationship between the tests comprising the Tactical Radar 
Aptitude Test Battery and the criteria of success are shown in Table 1. 

These correlations show that the original Tactical Radar Aptitude 
Test Battery was not very predictive of success at NTSch (Tactical 
Radar). In general, the predictive measures correlated higher in the 
case of classes 5 and 6 than they did in the case of classes 1 and 2. It 
seems probable that this increase in correlation should be attributed to 
the changed criteria of success. The results with classes 7, 8 and 9 
indicate, however, that the substitution of the second month grade for 
the “practical” grade did not result in any significant change in the cor- 
relations, with the exception of the relationship between the Officer 
Qualification Test and the achievement test and final grade. This is 
probably evidence of the fact that the Tactical Radar Aptitude Test and 
the original Relative Movement Test were in themselves not highly 
effective predictors of success.


Table 1

Correlation of Measures Comprising Original Tactical Radar Aptitude Test Battery
with Criteria of Success at NTSch (Tactical Radar)

The correlation coefficients show, by classes (*), the relationship between
tests of the Tactical Radar Aptitude Test Battery and various criteria of
success. The number of cases (**) in each class is also indicated.

Tactical Radar            Theory       Practical      Final      Achiev.  Final   Achiev.  Final
Aptitude Test             Grade         Grade         Grade       Test    Grade    Test    Grade
Battery
  Class *               1      2      1      2      1      2      5,6     5,6    7,8,9   7,8,9
  N **                 97    110     97    110     97    110     108     103    110***  110***

1. Officer Qualif.    .33    .17    .01    .07    .06    .10     .82     .31     .44     .41
2. Polar-Grid Coord.  .19    .03    .29    .20    .31    .19     .27     .19     .26     .24
3. Ratio Estimation   .07   -.19    .26   -.02    .26   -.06     .02    -.08     .17     .21
4. Coord. Reading    -.06   -.14    .18    .13    [?]    .08     .34     .28     .07     .02
5. Total (2+3+4)      .10   -.14    .31    [?]    [?]    .05     .22     [?]     [?]     .17
6. Relative Movement  .33    .05    .20   -.04    .23   -.03     .25     .22     .23     .18

*** Approximate number of cases varied from 105 to 115 for the six tests.


Officer Classification Test. The scores in the four tests comprising the 
Officer Classification Test correlated very low with the criteria of success
employed with class 3 (see Table 2). Again, results shown in this table 
demonstrate the effect upon the obtained correlations of a change in the 
criterion measures. For class 3, only the relationship between the verbal 
test and the final grade is above the 5 per cent level of significance. For 
classes 7, 8 and 9 the correlations for both the verbal and mathematical 
parts are well above the 1 per cent level of significance. It is also to be 
noted that the correlations with the final achievement examination are 
somewhat higher than the correlations with final grades. The most 
striking fact, however, is the high relationship between the mathematical 
part of the Officer Classification Test and both the achievement test and 
the final grade. Evidently this factor was much more closely associated 
with success in later classes than it was in the early classes. 
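
The statements about the 5 per cent and 1 per cent levels can be checked approximately with a short sketch. The article does not say which significance test the authors used; the version below assumes Fisher's z transformation with a normal approximation, which is one common choice, and the function name is hypothetical. The example uses the mathematical-part correlation of .44 with N = 83 from Table 2.

```python
import math


def r_p_value(r, n):
    """Approximate two-tailed probability that a sample correlation r
    from n cases arose by chance when the true correlation is zero,
    using Fisher's z transformation (normal approximation)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


p = r_p_value(0.44, 83)
print(p < 0.01)  # well beyond the 1 per cent level under these assumptions
```

Because the approximation depends only on r and N, it also shows why the same coefficient can be significant in a large class and not in a small one.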

Since the correlations for class 3 are based upon a relatively
unrestricted population, and the correlations for classes 7, 8 and 9 are based
upon a population which is considerably restricted, the increase in
correlations is all the more significant. Again it appears that the increase
can be attributed to the changed criteria of success in training.


Table 2

Correlations of Officer Classification Test Scores with Criterion Measures at NTSch
(Tactical Radar) Presented by Classes

The correlation coefficients show the relationship between the parts of the
Officer Classification Test and various criteria of success.

Officer                         Class 3               Classes 7, 8, 9
Classification                  N = 178                   N = 83
Test Parts            Theory   Practical   Final      Achiev.   Final
                      Grade    Grade       Grade      Grade     Grade

1. Verbal              .16      .13         .18        [?]       .32
2. Mechanical         -.06      .07         .04        .30       .19
3. Mathematical        .07     -.04         .01        .44       .49
4. Spatial            -.09      .01        -.02        .13       .08

CIC Aptitude Test. The effect of revising both the predictive
measures and the criterion measures is demonstrated by the results presented
in Table 3. Correlations presented in this table indicate that the CIC
Aptitude Test is effective in predicting success in training of officers at
NTSch (Tactical Radar). Whereas the correlations of the original
Tactical Radar Aptitude Test Battery (Table 1) fluctuated considerably
from class to class, these correlations represent a stable picture
throughout, for the parts of the test as well as for the total score. It is also
interesting to note that the test correlates as well with the final grade as
with the achievement test grade. The correlations with final grade
obtained for classes 8 and 9 are probably about as high as could be expected
for an officer training prediction study.


Table 3

Relationship Between CIC Aptitude Test Scores and Criteria of Success for
Classes 7, 8 and 9 at NTSch (Tactical Radar)

The correlation coefficients show, by classes, the relationship between the
parts and total score of the CIC Aptitude Test and the criterion measures.

                     Class 7            Class 8            Class 9
                     N = 66             N = 123            N = 117
CIC Aptitude      Achiev.  Final     Achiev.  Final     Achiev.  Final
Test Scores       Test     Grade     Test     Grade     Test     Grade

Part 1             .49      .39       .37      .45       .42      .43
Part 2             .55      .46       .38      .51       .56      .56
Part 3             .48      .31       .40      .51       .47      .39
Total              .61      .45       .45      .55       .57      .56


Interrelationship of Criterion Measures 


In a prediction study, one of the crucial factors is the nature of the
criterion against which predictive indices are to be correlated. Some
light on the nature of the criterion is shed by the results presented in
Tables 4 and 5. The striking fact in Table 4 is the low correlation
between the “practical” grade and the “theory” grade and between the
“practical” grade and the achievement test score. The correlation
between achievement test scores and “theory” grades of .59 indicates that
while these two measures are definitely related, they do measure
somewhat different facets of the student’s knowledge.


Table 4
Intercorrelations of Criterion Measures

Class 5
N = 134

                     Achiev.    Theory    Practical
                     Test       Grade     Grade

Achievement Test
Theory Grade          .59
Practical Grade       .11        .13
Final Grade           .80        .89       .31

It should be remembered that the “practical” grade consists of the 
rating which was made of the officer’s performance in CIC. While one 
would not expect a perfect correlation between such a measure and the 
officer’s performance in class work, it does not seem reasonable that the 
relationship should be as low as shown in this table. If knowledge about 
a subject contributes to performance, one would expect the relationship 
to be higher, since the rating purports to be an evaluation of the officer’s 
performance in the CIC. 

The intercorrelations in the case of class 9 are presented in Table 5.
These data indicate that elimination of the rating of performance in CIC
resulted in a substantial increase in the intercorrelation of criterion
measures. The correlations of .66 between the achievement test and the
second month grade and of .69 between the first month and second month
grades correspond more nearly to what one would expect the correlations
between the different criteria of success to be. In interpreting the
correlations between the different criterion measures and final grades, it
should be remembered that they are spuriously high due to the fact that
each criterion measure makes up one-third of the final grade.


Table 5
Intercorrelations of Criterion Measures
Class 9
N = 117

                      Achiev.    First Month    Second Month
                      Test       Grade          Grade

Achievement Test
First Month Grade     .78
Second Month Grade    .66        .69
Final Grade           .91        .93            .84
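
The part-whole inflation described above can be illustrated numerically. What follows is only a sketch and is not part of the original analysis: it assumes that the three criterion measures have equal standard deviations and enter the final grade with equal weight, and the function name part_whole_r is hypothetical. Under those assumptions the correlation of one part with the total follows from the intercorrelations alone.

```python
from math import sqrt


def part_whole_r(r_with_others, r_all_pairs):
    """Correlation of one part with an equally weighted sum of k parts.

    Assumes every part has the same standard deviation (an assumption,
    not something stated in the study).
    r_with_others: correlations of this part with each other part.
    r_all_pairs:   every distinct inter-part correlation, each pair once.
    """
    k = len(r_with_others) + 1
    numerator = 1.0 + sum(r_with_others)          # covariance of part with sum
    denominator = sqrt(k + 2.0 * sum(r_all_pairs))  # SD of the sum
    return numerator / denominator


# Intercorrelations of the three criterion measures for class 9.
pairs = [0.78, 0.66, 0.69]

achievement = part_whole_r([0.78, 0.66], pairs)
first_month = part_whole_r([0.78, 0.69], pairs)
second_month = part_whole_r([0.66, 0.69], pairs)

# Even three mutually uncorrelated parts correlate with their own sum.
uncorrelated = part_whole_r([0.0, 0.0], [0.0, 0.0, 0.0])

print(round(achievement, 2), round(first_month, 2), round(second_month, 2))
print(round(uncorrelated, 2))
```

Under these assumptions the intercorrelations for class 9 imply part-whole correlations of about .91, .92 and .87, close to the values reported for the final grade, and three mutually uncorrelated measures would still each correlate about .58 with their sum.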


Reliabilities of Predictive and Criterion Measures 


The reliability coefficients of the predictive and criterion measures are
shown in Table 6. They were computed by the Kuder-Richardson
method except for the “theory” grade and the “practical” grade. The
reliability of the “theory” grade was computed by correlating weekly
grades for the odd numbered weeks of the course with those for the even
numbered weeks of the eight-week course. The reliability of the
“practical” grade was obtained by correlating the ratings assigned by two
different raters. The class for which this particular reliability coefficient
was computed was unique in that two ratings were available for every
officer in every trait. For the majority of classes only fragmentary
information was available concerning any one officer. For this reason
it seems justifiable to conclude that this reliability coefficient represents
the upper level of reliability for the “practical” grade.


Table 6
Reliabilities of Predictive and Criterion Measures

Officer Qualification Test              .92
Officer Classification Test
    1. Verbal                           .92
    2. Mechanical                       .83
    3. Mathematical                     .78
    4. Spatial                          .81
Tactical Radar Aptitude Test            .90
    1. Polar-Grid Coordinate            .89
    2. Ratio Estimation                 .85
    3. Coordinate Reading               .67
Relative Movement Test                  .62
CIC Aptitude Test                       .92
    1. Polar-Grid Coordinate            .85
    2. Scale Reading                    .85
    3. Relative Movement                .82
Theory Grade                            [illegible]
Practical Grade                         .77
Achievement Test                        .85
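
The odd-even and two-rater procedures described above both reduce to a product-moment correlation between two sets of scores. The sketch below shows the idea in Python with invented grades; the function names and the data are hypothetical. The Spearman-Brown step-up is included only as a common companion to split-half estimates; the text does not say whether it was applied.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(weekly_grades):
    """Correlate each student's odd-week total with the even-week total.

    weekly_grades: one list of weekly grades per student (week 1 first).
    """
    odd = [sum(g[0::2]) for g in weekly_grades]   # weeks 1, 3, 5, 7
    even = [sum(g[1::2]) for g in weekly_grades]  # weeks 2, 4, 6, 8
    return pearson_r(odd, even)


def spearman_brown(r_half):
    """Step a half-length correlation up to full length (assumed step)."""
    return 2.0 * r_half / (1.0 + r_half)


# Hypothetical weekly grades for four students over an eight-week course.
grades = [
    [82, 85, 80, 88, 84, 86, 83, 87],
    [70, 74, 72, 69, 75, 71, 73, 70],
    [91, 89, 93, 90, 92, 94, 90, 91],
    [78, 80, 76, 79, 81, 77, 80, 78],
]
r_half = odd_even_reliability(grades)
print(round(r_half, 2), round(spearman_brown(r_half), 2))

# The two-rater reliability is the same correlation applied to ratings.
rater_a = [3, 4, 2, 5, 4]  # hypothetical ratings
rater_b = [3, 5, 2, 4, 4]
print(round(pearson_r(rater_a, rater_b), 2))
```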


Discussion 

The most significant finding of this study is the influence of the nature 
of the criterion upon the relationship between predictive indices and 
measures of success. The low correlations obtained with the early classes 
included in this study can in part be attributed to the fact that the 
criterion of success was not predictable by means of the types of tests 
used. Whether the criterion was appropriate for the type of course 
offered at NTSch (Tactical Radar) is, of course, a different question. 
However, the low relationship between “theory” and “practical” grades 
and the observations of the staff at NTSch (Tactical Radar) that rating 
procedures were not operating properly, lend credence to the belief that 
an improved criterion resulted from the introduction of objective course 
examinations and the final achievement examination. Certainly it can 
be said that the criterion in use with classes 7, 8 and 9 was predictable 
while the one in use with classes 1, 2 and 3 was not. 

A second major finding is the fact that the success in training of 
officers who are candidates for a specialized operational billet aboard a 
combatant vessel can be predicted with considerable efficiency. Since 
Naval officers represent a select population, it might have been assumed 
that any individual who is qualified to be a Naval Officer could qualify 
for a specialized operational billet such as Combat Information Center 
Officer. The results obtained in this study indicate that there are im- 
portant individual differences among Naval Officers and that for a special- 
ized operational billet some officers are definitely better qualified than 
others. This emphasizes the importance of careful screening of candi- 
dates for specialized operational billets, such as tactical radar, as well as 
for highly technical billets such as engineering and technical radar. 

A third fact which is evident in this study is the need for continuous 
refinement and improvement of predictive indices. The original tests 
and selection requirements used in this study represent good estimates of 
what was required to aid interviewing officers in selecting suitable candi- 
dates for tactical radar training. Results soon showed, however, that 
the Tactical Radar Aptitude Test and Relative Movement Test could be 
revised and improved. Revision of these tests, resulting in the
construction of the new CIC Aptitude Test, brought about substantial
improvement in prediction coefficients. Results such as these underscore
the need for continuous research if selection requirements are to remain 
valid. 

A fourth significant fact is the outstanding quality of the officers sent 
from indoctrination and reserve midshipmen schools to NTSch (Tactical 
Radar). The high average scores made in the Officer Classification Test 
by the members of classes 7, 8 and 9 indicate that tactical radar officers 
are drawn from the upper 15 per cent of the Naval Officer population. 
This fact contributes in part to some of the low correlations which were 
obtained in this study. Since very few failures occurred in the training, 
it was not possible to correlate predictive indices with a “success-failure”
criterion. If an unselected population of officers had been sent to NTSch 
(Tactical Radar), the obtained correlations would have been higher, but 
it would also have resulted in a larger failure rate and, in addition, the 
fleet would have received tactical radar officers of markedly lower caliber. 
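
The remark that an unselected population would have given higher correlations can be made concrete with the standard correction for direct restriction of range on the predictor. This is a sketch only: the correction is not used in the study, the function name is hypothetical, and the values 0.40 and 2.0 are invented for illustration.

```python
from math import sqrt


def correct_for_range_restriction(r_restricted, sd_ratio):
    """Estimate the correlation in an unselected group.

    r_restricted: correlation observed in the selected group.
    sd_ratio: predictor SD in the unselected group divided by the
              predictor SD in the selected group (1.0 means no selection).
    """
    r = r_restricted
    return r * sd_ratio / sqrt(1.0 - r * r + r * r * sd_ratio * sd_ratio)


# Hypothetical case: r = .40 among selected officers whose test scores
# vary only half as much as those of officers in general.
estimated = correct_for_range_restriction(0.40, 2.0)
print(round(estimated, 2))
```

The design point is simply that the same test can show a modest coefficient in a highly selected class and a much larger one in an unselected group, which is the argument the paragraph makes.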


Summary 


The purpose of this study was to investigate the validity of the tech- 
niques used to select officers for tactical radar training and to determine 
the suitability of the criteria of success employed at Naval Training 
School (Tactical Radar). In the main, it was found that: 


1. The Officer Qualification Test, Tactical Radar Aptitude Test, and
the Relative Movement Test did not correlate highly with the scholastic
success of students enrolled in the early classes.

2. The verbal and mathematical parts of the Officer Classification
Test and the CIC Aptitude Test, which was a revision of the Tactical
Radar Aptitude Test, showed substantial correlations with scholastic
success of later classes.

3. The increased magnitude of the correlations obtained with the 
latter tests is partially attributable to the refinement of the criteria of 
success by the introduction of objective course examinations and the use 
of a comprehensive final achievement test. 


Received December 31, 1945. 











Prediction of Achievement in Typewriting and Stenography 
in a Liberal Arts College 


Dorothy M. Barrett 
Hunter College of the City of New York 


Mindful of the day when they must seek employment in a world for 
which an A.B. degree may be inadequate vocational preparation, many 
Hunter College students have been adding typewriting and stenography 
to their studies in the liberal arts. Because a number of these young 
women discovered after a considerable investment in time that they 
lacked the necessary aptitude, it seemed imperative to try to find the 
means of predicting in advance a student’s probable degree of success or 
failure in these subjects. 

The results obtained in this study could be used either in the selection 
of students for a course when more students register than can be ad- 
mitted, or to counsel the individual student who is debating the advis- 
ability of studying typewriting and shorthand. 


Procedure 


A total of 96 students who had registered in the course in typewriting 
and 75 students who had registered for stenography took the tests. Ad- 
ministered after the students had signed into the courses but before they 
had begun class work, the tests included Bennett’s Stenographic Aptitude 
Test, the Kuder Preference Record, the MacQuarrie Test for Mechanical 
Ability, the Minnesota Vocational Test for Clerical Workers, Strong’s 
Vocational Interest Blank for Women, Thurstone’s Vocational Interest
Schedule, and the Turse Shorthand Aptitude Test. 

Final grades in each course were taken as the criteria of success. 
The final grades were based on speed and accuracy as demonstrated in 
class periods and in tests administered at the end of the course. End 
term grades were in no way influenced by the aptitude and interest test 
scores inasmuch as no instructor knew the results of these tests. 

Any student who earned an A or B in the course under discussion will 
be referred to in this study as good; any student who earned D or F will 
be called poor or a failure, even though D is a passing grade for the 
course. Because students who earned C seemed to be neither true 
failures nor true successes, they have been classified separately.


Results in Typewriting


In Table 1 are listed the tests which differentiated between good and 
poor typists. Also included in the table are the chances for getting an 
A or B, C, D or F for each of several ranges of scores for each test. 


Table 1

Showing Distribution of Grades in Typewriting for Several Ranges of Scores for
Each of Several Aptitude Tests

                                                    Grades
                                            A or B    C    D or F
Tests                                          Chances in 100        No.

Minnesota Vocational Test for Clerical Workers
    Number Comparison
        150 and over
        100-149
        Below 100
    Name Comparison
        150 and over
        130-149
        Below 130
MacQuarrie Test for Mechanical Ability
    Tracing Test
        50 and over
        0-49
    Dotting Test
        21 and over
        0-20
    Pursuit Test
        22 and over
        14-21
        0-13                                                         ( 7)
Turse Shorthand Aptitude Test
    Total Score
        420 and over                                                 (30)
        Below 420                                                    (66)
Total Undifferentiated Group                                         (96)





Both parts of the Minnesota Vocational Test for Clerical Workers
distinguished students who earned A or B grades from those students who
earned C, D or F grades at the end of the term, a fact which would be
anticipated from the work reported by Andrew and Paterson¹ and by
Eriksen.² In fact, the test of number comparison differentiated between
good and poor students more accurately than any of the other tests
included in this study. The chances for an A or B were 83 in 100 for those
students who scored 150 and over on the number comparison test with
only four chances in 100 for failing. The prevailing chances for the
group as a whole for an A or B were 65 in 100 with 24 chances in 100 for
failing.

¹ Andrew, D. M., and Paterson, D. G. Measured characteristics of clerical workers.
Bull. of Empl. Stab. Res. Inst., Univ. of Minn., 1934, Vol. III, No. 1, pp. 1-60.
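
The “chances in 100” quoted above are simply grade frequencies within a score range expressed as rounded percentages. A minimal sketch in Python follows. The counts 19, 3 and 1 are not reported in the article; they are an assumption, back-calculated from the 23 students scoring 150 and over on number comparison, and the function name is hypothetical.

```python
def chances_in_100(counts):
    """Convert grade counts (A or B, C, D or F) to rounded chances in 100."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]


# Assumed counts for the 23 students scoring 150 and over on the
# number comparison test (back-calculated, not given in the source).
number_comparison_150_plus = chances_in_100([19, 3, 1])
print(number_comparison_150_plus)
```

Because each figure is rounded separately, a row of chances may sum to 99 or 101 rather than exactly 100.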

The pursuit, tracing and dotting parts of the MacQuarrie test also 
differentiated between good and poor typists. The other parts of the 
MacQuarrie test differentiated poorly or not at all. Although Turse 
makes no claim that his test predicts efficiency outside of the field of 
shorthand, the composite total score did differentiate to a fair degree 
between good and poor typists. 


Table 2

Showing the Distribution of Grades in Typewriting for Students Selected by
Successively Imposed Critical Scores

                                                    Grades
                                            A or B    C    D or F
Test Scores                                    Chances in 100        No.

Number Comparison, 150 or over                83     13      4      (23)
Name Comparison, 150 or over                  75     15     10      (20)
Tracing, 50 or over                           85      0     15      (13)
Undifferentiated Group Remaining              57     13     30      (23)
Below 22 on Dotting and below 22
    on Pursuit                                23     12     65      (17)





For the group of 96 students taking typewriting, no relationship to 
grades was found for the two scores on the Bennett Stenographic Aptitude 
Test, the Commercial factor on the Thurstone Interest Schedule, the 
ratings for General Office Worker or for Stenographer-Secretary on the 
Strong Vocational Interest Blank for Women, the ratings for Clerical 
Interest on the Kuder Preference Record, nor for the remaining parts of 
the MacQuarrie test. 

An attempt was made to increase the effectiveness of the predictions 
of grades in typewriting by combining the results of several of the tests. 
Table 2 shows the distribution of grades when the scores for the number 
comparison, name comparison, tracing, dotting and pursuit tests were 
used successively or in combination to differentiate between good and 
poor students. This particular combination of test scores proved to 
yield the best results. 


² Eriksen, E. G., et al. A demonstration of individualized training methods for
modern office workers. Bull. of Empl. Stab. Res. Inst., Univ. of Minn., 1934, Vol. III,
No. 2, pp. 1-60.

We began by eliminating from further consideration the 23 students
with scores of 150 or over for number comparison. Reference to Table 
2 will show that these students had 83 chances in 100 for an A or B and 
only four chances in 100 for a D or F. 

For the 73 students who remained of the original 96, we set a minimum 
score of 150 for name comparison, thereby identifying twenty more 
students. These students had 75 chances in 100 for an A or B and only 
10 chances in 100 for a D or F. 

Next, a minimum score of 50 on the tracing test was set, thereby 
picking out another 13 students whose chances for an A or B were 85 in 
100 but whose chances for failure were 15 in 100. 

At this point, 56 students had been eliminated from the group, leav- 
ing 40 students. From this group it was possible to sort out 17 students 
who had a score below 22 on both the dotting and pursuit parts of the 
MacQuarrie test. Their chances for success were only 23 in 100, and for 
failure were 65 in 100. 

The remaining undifferentiated group of 23 students did not yield to
further analysis in the effort to sort out the good from the poor students
on the basis of test scores. These students can be identified in Table 2
as the undifferentiated group having intermediate chances for success and
failure.

The author concluded that the administration of the Minnesota
Vocational Test for Clerical Workers, and the tracing, dotting and
pursuit parts of the MacQuarrie Test for Mechanical Ability represented a
brief but effective combination of tests which would provide a fairly good
estimate of aptitude for typewriting as taught at Hunter College.
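
The successive critical scores described above amount to a short chain of tests applied in a fixed order, each student being assigned to the first group whose condition is met. The following Python sketch uses the cutoffs given in the text; the dictionary keys, group labels and the three student records are hypothetical.

```python
from collections import Counter


def classify(student):
    """Assign a student to the first group whose critical score is met."""
    if student["number_comparison"] >= 150:
        return "number comparison 150 or over"
    if student["name_comparison"] >= 150:
        return "name comparison 150 or over"
    if student["tracing"] >= 50:
        return "tracing 50 or over"
    if student["dotting"] < 22 and student["pursuit"] < 22:
        return "below 22 on dotting and pursuit"
    return "undifferentiated"


# Hypothetical score records, for illustration only.
students = [
    {"number_comparison": 162, "name_comparison": 140, "tracing": 41,
     "dotting": 19, "pursuit": 20},
    {"number_comparison": 120, "name_comparison": 118, "tracing": 35,
     "dotting": 18, "pursuit": 15},
    {"number_comparison": 131, "name_comparison": 125, "tracing": 44,
     "dotting": 25, "pursuit": 19},
]
group_sizes = Counter(classify(s) for s in students)
print(group_sizes)
```

The order of the tests matters: a student who passes an earlier cutoff is never examined on the later ones, which is why the group sizes in Table 2 add up to the 96 students without overlap.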


Results in Stenography 


In Table 3 are listed the tests which differentiated between good and 
poor stenography students. Also listed in the table are the chances for 
getting an A or B, C, or D or F for each of several ranges of scores for 
each test, as well as for the total group undifferentiated by any test scores. 

On the basis of the distribution of grades for the 75 students of 
stenography in this study, without reference to any test scores, a student 
had 72 chances in 100 for an A or B, 21 in 100 for a C, and seven in 100 for 
a D or F. Any test which yielded scores which resulted in greater chances
for success or greater chances for failure than those which prevailed for 
the group as a whole was judged a useful measure. 

The transcription scores on the Turse Shorthand Aptitude Test, the
pursuit and blocks scores of the MacQuarrie Test for Mechanical Ability,
and the number comparison scores for the Minnesota Vocational Test for
Clerical Workers differentiated most clearly between those students who
did well in stenography and those students who did only average or poor
work. Scores on the tapping, dotting and copying parts of the
MacQuarrie test, scores for transcription and for spelling on Bennett’s
Stenographic Aptitude Test, and clerical scores on the Kuder Preference
Record gave predictions which were an improvement over those
predictions which could be made with no test results, but were less effective
than the first mentioned sets of scores. The actual effectiveness of each
test can be read directly from Table 3.


Table 3

Showing Distribution of Grades in Stenography for Several Ranges of Scores for
Each of Several Tests

                                                    Grades
                                            A or B    C    D or F
Tests                                          Chances in 100        No.

Minnesota Vocational Test for Clerical Workers
    Number Comparison
        150 and over                          85      8      7      (13)
        110-149                               72     23      5      (53)
        0-109                                 56     33     11      ( 9)
MacQuarrie Test for Mechanical Ability
    Tapping
        48 and over                           81     10      9      (21)
        0-47                                  69     26      5      (54)
    Dotting
        26 and over                           81     19      0      (16)
        0-25                                  70     22      8      (59)
    Copying
        45 and over                           75     25      0      (12)
        25-44                                 76     16      8      (42)
        0-24                                  67     24      9      (21)
    Blocks
        14 and over                           86      4     10      (21)
        0-13                                  67     28      5      (54)
    Pursuit
        24 and over                           89      8      3      (26)
        0-23                                  63     29      8      (49)
Bennett’s Stenographic Aptitude Test
    Transcription
        110 and over                          77     19      4      (53)
        0-109                                 59     14     27      (22)
    Spelling
        38 and over                           78     17      5      (62)
        0-37                                  46     38     16      (13)
Turse Shorthand Aptitude Test
    Phonetic Association
        48 and over                           78     18      4      (44)
        0-47                                  65     26      9      (31)
    Transcription
        70 and over                           85     15      0      (20)
        0-69                                  67     24      9      (55)
Kuder Preference Record
    Clerical
        55 and over                           76     24      0      (37)
        0-54                                  69     18     13      (38)
Total Undifferentiated Group                  72     21      7      (75)
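
A table of this kind can be checked for internal consistency: the chances in each row should sum to about 100, the group sizes for one test should sum to the 75 students, and the size-weighted average of the rows should reproduce the figures for the undifferentiated group. The sketch below, in Python, applies these checks to the pursuit rows of Table 3. The function name and the tolerance of two points (to allow for rounding) are assumptions made for illustration.

```python
def check_table(rows, total_n, total_chances, tol=2.0):
    """Return a list of inconsistencies found in one test's rows.

    rows: (a_or_b, c, d_or_f, n) tuples, one per score range.
    total_n: number of students in the whole group.
    total_chances: (a_or_b, c, d_or_f) chances for the whole group.
    tol: allowed discrepancy in points, to absorb rounding (assumed).
    """
    problems = []
    n_sum = sum(r[3] for r in rows)
    if n_sum != total_n:
        problems.append(f"group sizes sum to {n_sum}, expected {total_n}")
    for r in rows:
        row_sum = r[0] + r[1] + r[2]
        if abs(row_sum - 100) > tol:
            problems.append(f"row {r} sums to {row_sum}, not 100")
    for j, label in enumerate(("A or B", "C", "D or F")):
        weighted = sum(r[j] * r[3] for r in rows) / n_sum
        if abs(weighted - total_chances[j]) > tol:
            problems.append(
                f"{label}: weighted {weighted:.1f} vs total {total_chances[j]}"
            )
    return problems


# Pursuit rows from Table 3: 24 and over, then 0-23.
pursuit = [(89, 8, 3, 26), (63, 29, 8, 49)]
issues = check_table(pursuit, 75, (72, 21, 7))
print(issues)
```

A row that fails these checks is a candidate for a misprint or a misread digit rather than proof of one, since the published percentages were rounded separately.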

The scores for the remaining parts of the Turse Shorthand Aptitude
Test and the MacQuarrie Test for Mechanical Ability as well as the scores 
for Stenographer-Secretary on the Strong Vocational Interest Blank for 
Women and the scores for Commercial interest on the Thurstone Voca- 
tional Interest Schedule failed to show any significant relationship to 
grades in stenography for the group of students studied. 

The effectiveness of the predictions of grades in stenography was 
improved by combining the results of several tests. Table 4 shows the 
results when the Turse transcription scores, the pursuit scores for the 
MacQuarrie test, the number comparison scores for the Minnesota test, 
and the phonetic association scores of the Turse stenographic test were 
used in combination. 


Table 4

Showing the Distribution of Grades in Stenography for Students Selected by
Successively Imposed Critical Scores

                                                    Grades
                                            A or B    C    D or F
Test Scores                                    Chances in 100        No.

Turse Transcription, 70 or over
    or Pursuit, 24 or over                                           (36)
Number Comparison, 150 or over                                       ( 5)
Turse Association, 48 or over                                        (21)
Remaining Group                                                      (13)
Total Unclassified Group                                             (75)





From the use of a score of 70 or over on the Turse transcription test 
or a score of 24 or over on the pursuit test, 36 students were singled out 
who had 89 chances in 100 for an A or B and might therefore have been 
accepted for training at once. Of the remaining group of students, 5 
had a score of 150 or over on the number comparison test. These 
students had 80 chances in 100 for an A or B and might also have been 
accepted for the course.

Next it was possible to single out of the students still remaining, a
group of 21 students with 67 chances in 100 for success and with 33 
chances in 100 for being only average or poor. Finally, 13 individuals 
remained who had limited chances for success and 67 chances in 100 to be 
only average or poor. Individuals in both of these last two groups might 
well have been warned that they would have to exert themselves to be 
able to compete successfully with the individuals with higher scores. 

In conclusion, then, as a basis for advising students about to study 
stenography, the data seemed to warrant the use of the transcription and 
phonetic association tests of the Turse Shorthand Aptitude Test, the 
pursuit scores of the MacQuarrie Test for Mechanical Ability, and the 
number comparison test of the Minnesota Vocational Test for Clerical 
Workers. 


Summary


Grades earned at the end of one term of stenography or typewriting 
studied at Hunter College of the City of New York were related to scores 
on a series of aptitude and interest tests in the case of 96 students taking 
typewriting and 75 students studying shorthand. Administered after 
the students had signed into the courses but before they had begun class 
work, the tests included Bennett’s Stenographic Aptitude Test, the Kuder 
Preference Record, the MacQuarrie Test for Mechanical Ability, the 
Minnesota Vocational Test for Clerical Workers, Strong’s Vocational 
Interest Blank for Women, Thurstone’s Vocational Interest Schedule, and 
the Turse Shorthand Aptitude Test. 

The number and name comparison scores from the Minnesota Voca- 
tional Test for Clerical Workers, the tracing, dotting and pursuit scores 
of the MacQuarrie Test for Mechanical Ability and the total scores from 
the Turse Shorthand Aptitude Test differentiated between good and poor 
typists. However, for practical purposes, the author concluded that the 
two scores from the Minnesota Vocational Test for Clerical Workers, and 
the tracing, dotting and pursuit scores of the MacQuarrie test provided 
satisfactory predictions for advising students interested in studying type- 
writing. 

Of the considerable number of tests which differentiated between good 
and average stenography students, the pursuit scores from the Mac- 
Quarrie test, the number comparison scores from the Minnesota test, 
and the transcription and phonetic association scores on the Turse Short- 
hand Aptitude Test proved a combination which provided the maximum 
possible predictions on the basis of the test data reported in this study. 


Received January 5, 1946.



Readability of Mixed Type Forms *


Miles A. Tinker and Donald G. Paterson 
University of Minnesota 


Some newspapers, in order to produce a supposedly high degree of 
reader attention, introduce a medley of typographical arrangements on 
the same page or in different sections of the same feature article.
Newspaper editors refer to this as “change of pace.”

A moderate degree of “change of pace” typographical arrangement 
may be the practice in most newspapers. Nevertheless it can be carried 
to an extreme in some cases. For instance, a feature story occupying 
about one-third of a page in the Sunday edition of a metropolitan paper 
was printed with the following variations in typography: ordinary 
Roman lower case, italics in both lower case and all-capitals, all-capitals, 
bold face in lower case, capitals and italics, different line widths, different 
type sizes and amounts of leading, and boxed-in material. Most varia- 
tions appeared several times, each time for a phrase, a sentence or a 
paragraph. A sample is shown in Figure 1. The justification for this 
kind of printing practice should rest upon experimental evidence and not 
upon the views of a particular editor no matter how experienced the 
latter may be. In addition to achieving certain reactions from the 
reader, the “change of pace” should also receive reader approval and the 
text should not sacrifice readability. 

The present study was undertaken to measure the readability of and 
reader preferences for two medley typographical arrangements in com- 
parison with straight-forward lower case Roman type. Readability was 
measured in terms of speed of reading and preferences were determined in 
terms of judged legibility and judged pleasingness. 

The reading material consisted of Forms A and B of the Chapman- 
Cook Speed of Reading Test. Although performance on Form B is in 
general equivalent to performance on Form A, a control group was intro- 
duced to check the equivalence. There were 30 paragraphs of 30 words 
each in each test form. Reading time allowed was 1¾ minutes on each
form. 

The test forms were printed in the following typographical
arrangements: (1) Form A and Form B were printed in 7 point Excelsior
newspaper type with a 12 pica line width and one point leading on newsprint
paper stock. Form B was also printed in medley arrangement No. 1
which involved the following typographical variations: (1) Ten point
Roman lower case, 12 pica line width, 2 point leading; (2) Same as (1)
with italic rather than Roman; (3) Seven point Roman lower case, 12
pica line width, 1 point leading; (4) Same as (3) but in bold face; (5) Same
as (4) but with 10}4 pica line width; (6) Same as (3) but in all-capitals
rather than Roman; (7) Same as (6) but with 104 pica line width; (8)
Same as (3) but in all-capitals bold face; and (9) Same as (3) but with 11
pica line width and boxed in.

Fig. 1. Medley arrangement or “change of pace” newspaper layout which
prompted the present study.

* Grateful acknowledgment is given to the Graduate School, University of
Minnesota, for research grant to finance this study.

In addition, Form B was printed in medley arrangement No. 2 with
the following variations: (1) Ten point Roman lower case, 12 pica line
width, 2 point leading; (2) Seven point Roman lower case, 12 pica line
width, 1 point leading; (3) Same as (2) but in bold face; (4) Same as (2)
but in all-capitals; (5) Same as (4) but in bold face; (6) Ten point Roman
bold face, 9 pica line width, 2 point leading; (7) Same as (6) but in a 10%
pica line width; (8) Same as (6) but in italics; (9) Ten point all-capital
italics, 9 pica line width, 2 point leading; (10) Ten point Roman bold
face, 10 pica line width, 2 point leading, boxed in; and (11) Same as (10)
but in italics (not bold face).

In medley arrangement No. 1 all the printing was in Excelsior news- 
print type face except the bold face which was in Memphis. Each vari- 
ation involved a phrase, a sentence, or one or two paragraphs with 
repetitions. In medley arrangement No. 2 there was greater variation 
in line widths and more frequent changes from one arrangement to an- 
other within a paragraph, and boxed in paragraphs were used. 

Three groups of 94 college students each served as subjects. Group 
testing in the classroom situation was employed. In each group, Form 
A was the standard. Each subject read the standard form first, followed 
by Form B typographically identical (control) or in one of the two 
medley arrangements. The order of presenting the test forms was 
systematically varied. See Tinker and Paterson¹ for details of
methodology. Analysis of the data will reveal the influence upon readability
of the medley arrangements in comparison with the standard arrangement.

Data for the speed of reading measurements are given in Table 1. 
In Test Group I, the control group, the results show that 0.31 paragraph 
must be added to the mean score on Form B in each test group to estab- 
lish equivalence of the two forms. The data for Test Group II reveal 
that medley arrangement No. 1 was read 1.48 paragraphs slower than the 
standard arrangement. Similarly, as shown in Test Group III, medley 
arrangement No. 2 was read 2.00 paragraphs slower than the standard. 
The corresponding percentage differences are 8.35 and 11.39 respectively. 
The figures in column 10 show that these differences are statistically
significant. As a matter of fact, our studies² have revealed few
non-optimal typographical situations which retard speed of reading by more
than 8 per cent, and very few that retard reading rate by as much as 11 
per cent. The medley arrangements of type, therefore, produce a 
severe adverse influence upon readability of newsprint. 
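
The arithmetic behind these figures can be sketched as follows. The Python below assumes that each percentage is the corrected difference divided by the mean for the standard arrangement, which is consistent with the reported numbers but is not stated in the text. The function names are hypothetical, and the raw Form B mean of 15.93 is invented for illustration.

```python
def corrected_difference(mean_standard, mean_form_b, form_correction=0.31):
    """Paragraphs by which Form B lags the standard after equating forms.

    form_correction is the 0.31 paragraph added to the Form B mean to
    make the two test forms equivalent.
    """
    return mean_standard - (mean_form_b + form_correction)


def percent_retardation(paragraphs_slower, mean_standard):
    """Difference expressed as a percentage of the standard mean (assumed)."""
    return 100.0 * paragraphs_slower / mean_standard


def implied_standard_mean(paragraphs_slower, percent):
    """Standard-arrangement mean implied by a difference and its percentage."""
    return paragraphs_slower / (percent / 100.0)


# Means implied by the reported differences and percentages.
mean_group_2 = implied_standard_mean(1.48, 8.35)
mean_group_3 = implied_standard_mean(2.00, 11.39)
print(round(mean_group_2, 2), round(mean_group_3, 2))

# Hypothetical raw Form B mean for a medley arrangement.
diff = corrected_difference(17.72, 15.93)
print(round(diff, 2), round(percent_retardation(diff, 17.72), 2))
```

On this reading the standard arrangement was read at a mean of roughly 17.6 to 17.7 paragraphs in the time allowed, so a loss of 1.48 or 2.00 paragraphs corresponds to the 8.35 and 11.39 per cent quoted.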

There appear to be at least three factors which operate to retard rate
of reading in the medley arrangements of type: (1) Our previous
investigations have shown that text in all-capitals seriously retards speed
of reading and that material printed in italics retards reading slightly.
Furthermore, readers have a strong aversion to reading text in either
all-capitals or italics. (2) The shorter line width coupled with the larger
size of type (10 point) interferes with effective perceptual habits in
reading. That is, there are so few words per line that the important role of
peripheral vision in speeding up perception is much less effective. When
the line width is optimal, peripheral (and less distinct) vision of words
along the line to the right of the fixation point gives premonitions of
meanings and also guides the eye to successive fixations along the line.
To eliminate effective use of peripheral vision, therefore, retards rate of
reading. (3) The constantly occurring changes in typography (i.e.,
ordinary lower case to bold face to all-capitals to boxed-in material, etc.)
are probably distracting to continuous and evenly sustained attention to
meanings. This would also tend to retard speed of reading.

¹ Tinker, M. A., and Paterson, D. G., Studies of typographical factors influencing
speed of reading: XII. Methodological considerations. J. appl. Psychol., 1936, 20,
132-145; also Paterson and Tinker, How to make type readable. New York: Harper and
Bros., 1940 (can be obtained from the authors).

² Paterson and Tinker, op. cit., 1940.

For the preference study, the whole test of 30 paragraphs was mounted
on cardboard (the standard, medley arrangement No. 1, and medley
arrangement No. 2) and 181 readers ranked the specimens according to
opinions of legibility and according to preference as to pleasingness.
The results of the judgments for apparent legibility are shown in Table
2. The average ranks reveal that the standard arrangement (7 point,
12 pica, 1 point leading) was judged most legible, medley arrangement
No. 1 next, and medley arrangement No. 2 poorest. This is the same order
as for readability measured in terms of speed of reading. Note, however,
that the differences between mean ranks are not large.


Table 2

Uniform and Medley Newsprint Arrangements Ranked According to
181 Reader Opinions of Legibility *

Type Variation                    Average Rank    S.D.    Rank Order

7 pt., 12 pica, 1 pt. leading         1.83         .93        1
Medley Arrangement (1)                1.98         .56        2
Medley Arrangement (2)                2.19         .87        3

* See text for specifications of medley typographical arrangement.

The results for judgments of pleasingness are listed in Table 3. Here,
medley arrangement No. 1 was considered most pleasing, the standard in
7 point type was ranked next, and medley arrangement No. 2 was least
pleasing. Again the differences between the standard and medley
arrangement No. 1 were not large, but medley arrangement No. 2 was well
separated from the other two kinds of printing. For medley arrangement
No. 1, therefore, the ranks for pleasingness do not agree entirely with the
ranks for judged legibility nor for the speed of reading results.


Table 3

Uniform and Medley Newsprint Arrangements Ranked According to
181 Reader Opinions of Pleasingness *

Type Variation                    Average Rank    S.D.    Rank Order

7 pt., 12 pica, 1 pt. leading         1.86         .89        2
Medley Arrangement (1)                1.70         .52        1
Medley Arrangement (2)                2.44         .80        3

* See text for specifications of medley typographical arrangement.

Although in earlier studies we have found a few differences between
preferences and readability,³ the judgments for legibility and for
pleasingness have tended to agree to a marked degree.⁴ The discrepancies in
this study are difficult to interpret. Apparently readers consider that some
variety in typographical arrangement is desirable from the viewpoint of
pleasingness even though they consider such variation to be somewhat less
legible than uniform typography. The readers who expressed the
preferences have been exposed daily to a considerable amount of
typographical variation in the local newspapers. Familiarity with such
typography may have developed either a tolerance to or a liking for
medley arrangements.

Is the practice of introducing a medley of typographical arrangements 
in newspaper printing based upon sound principles? On the one hand 
we find that the medley arrangements severely retard speed of reading 
and presumably ease of reading. Readers consider the medley arrange- 
ments less legible than uniform typography. On the other hand, readers 
who have been exposed to the practice judge the milder degree of medley 
arrangement to be slightly more pleasing than uniform typography, but 
dislike severe degrees of ‘“‘change of pace” arrangement. Added to this, 
there are other reader reactions not measured in this study which may be 
either favorable or unfavorable. In making his decision concerning the 
use of medley arrangements, the editor should balance the factors of 

poor readability plus readers’ unfavorable opinions in regard to readability 
versus whatever advantages are known to accrue from their use. It may 
be difficult to compensate for a loss of 8 to 12 per cent in readability and 
adverse reader opinions on legibility by other alleged advantages which 
may or may not be present. 


* Paterson and Tinker, op. cit., 1940. 
* Tinker, M. A., nada apa ht G., Reader preferences and typography. J. appl. 
Psychol., 1942, 26, 38~40. 


Summary 


1. The purpose of this investigation was to determine the readability 
of mixed type forms. 

2. The speed of reading 7 point Excelsior newsprint in a 12 pica line 
width with one point leading was compared with the speed of reading two 
medley arrangements of newspaper type. 

3. Medley arrangement No. 1 was read 8.35 and medley arrangement 
No. 2 was read 11.39 per cent more slowly than the 7 point Excelsior in 
uniform arrangement. This amount of retardation is serious and is 
seldom shown in non-optimal typography. 

4. The slower rate of reading the medley arrangements is apparently 
due to several factors: (a) the slower rate for reading text in all-capitals, 
in italics and in non-optimal line widths, and (b) the possible distraction 
produced by frequently shifting from one typographical arrangement to 
another. 

5. Judged legibility was in line with readability measurements. The 
7 point newsprint in uniform arrangement was judged most legible, 
medley arrangement No. 1 was next, and medley arrangement No. 2 was 
rated least legible. 

6. Medley arrangement No. 1 was rated most pleasing, the uniform 
7 point text was next and medley arrangement No. 2 was a poor third. 
The difference in average rank between the first two, however, was not 
large. Apparently these readers tended to consider some variation in 
typography as more pleasing even though they judged such variation to 
be less legible than uniform typography. 

7. In deciding to employ a medley arrangement in newspaper print- 
ing, the editor should consider whether certain alleged advantages more 
than compensate for the severe loss in readability and the adverse opinions 
of readers. 


Received December 14, 1945. 








Recombination of Ideas in Creative Thinking * 


Livingston Welch 
Institute for Research in Clinical and Child Psychology, Hunter College 


Creative thinking or imagination is rated by many of the standard 
projective tests, such as the Rorschach test and the Thematic Appercep- 
tion test. L. L. Thurstone and J. J. O’Connor and others have, in fact, 
devised special tests for this ability. So many factors, however, are in- 
volved in creative thinking that it seems desirable to study one, at least, 
which may be essential to all types. In this study it has been assumed 
that the ability to readily recombine or reorganize ideas according to 
some specific pattern is essential to all types of creative thinking, whether 
it be painting a landscape, inventing some new scientific instrument, or 
composing a new advertisement. 

The recombination of ideas per se is common to mental activity which
in the strict sense of the word we would not call creative thinking. Ideas
in dreams are recombined to form images and ideas that we have never
seen or thought of before. The motivation may be explained in terms of
wish fulfillment, but there is no set plan or scheme involved in these
recombinations. Such mental activity we might call phantasy. On the other
hand, the creative artist or scientist recombines ideas in an attempt to 
achieve some goal or to solve some problem. 

In an attempt to observe the part that the ability to recombine ideas 
according to plan plays in creative thinking, a test was constructed in 
which the subject was obliged to recombine familiar ideas according to 
four different patterns. The test was then given to a group of college 
juniors and seniors and to a group of professional artists. We were not
pretending to test artistic ability alone. We did assume, however, that 
one of the many characteristics which are essential to artistic ability is 
the ability to recombine ideas quickly according to a plan and that where 
this ability was found to be sadly lacking, creative thinking might be 
significantly limited. We were willing to admit that one artist might be 
considered much greater than another and still the first artist might be 
slower in recombining ideas. In such an instance the superiority of the 
first artist could be explained in terms of the many other characteristics 
which he possessed; yet, despite these differences, we did expect the
professional artists as a group to be more successful on this test than the
average intelligent layman represented by the college group.

* The author is indebted to Dr. Louis Long, Miss R. M. Thomas and Miss Lee
Clarke for their aid in constructing and administering the test.


Procedure 


The test was divided into four parts. The first three sub-tests made 
use of written material and the fourth made use of blocks. The total 
testing time was 26 minutes. 


Part I. Instructions 


Recombine the words of each group on the next page to make as many 
meaningful, grammatical sentences as possible. For example, here is a group 
of ten words. 

MEN SKY IS FIGHT THAT THE SLOW BRIGHT OF FOR 
which can be recombined into the following sentences: 


Men fight for the sky. 
The sky is bright. 
The fight is slow. 

Etc. 


You will receive as much credit for a short sentence as for a long one. Your
sentences do not have to be artistic, but they must be grammatical. There 
must be at least a subject and a predicate. You will receive credit for a 
sentence which is only slightly different from another. A word from the group 
can be used only once in the same sentence, but it may be used any number of 
times in other sentences. Only use words from the group that you are exam- 
ining at the time. You may skip from one group to another, if you like. 

There are ten of these groups and you have only ten minutes in which to
complete the test. Are there any questions? . . . Do not turn the page until
the examiner says “Start.”

The following are the ten groups of Part I.

1. [illegible] CLIMBS RUNS THOSE A SMOOTH GOOD
2. [illegible] BUILT STOOD A THAT LARGE STRONG
3. [illegible] TRAVELS WAS THIS THAT BIG COOL
4. SEA WOMAN MOVE COULD THESE THE GREEN ROUGH WITH OF
5. DEN LION ATE IS BIG DEEP THESE THE OF BY
6. [illegible] LEFT HAS BLUE FRIGHTENED THE A
7. LEMON WIFE COOKS FINDS THAT SOFT ROUND WITH FROM
8. POTATOES MAID CUT ONCE SMALL HOT THESE A OF FOR
9. [illegible] WAITS CATCHES THE A LONG COLD BY
10. SLOWLY THE GOLDEN LIGHT THAT RESTED UPON THEM MOVED AWAY.


Part II. Instructions 


Make as many letters as possible, using no more and no less than three
straight lines. For example, the letter A is made with three straight lines, two
slanting downward and one across. You will be given no credit for the letter
A, since it is an example.

Make as many letters as possible, using no more and no less than two
straight lines.

Make as many letters as possible, using no more and no less than one straight
line and one semi-circle.

The time limit is three minutes.


Part III. Instructions 


On the next page you will be given a list of twenty words which you are to 
connect into a story. You must be certain to use the words in the order in 
which they appear on the list. If the first word on the list is “house” and the 
second word is “tree,” you must first make use of the word “house” in your 
story and then make use of the word “tree.” You must not skip any of the
words. 

Your story must be grammatical and logically related. It must have a 
beginning and an end. You will be rated on the number of words you make 
use of in the time allotted. Write as fast as you can and underline each of the
twenty words as you use it. 

The time limit is three minutes. 

The words used in this test were: 

STAIRS OCEAN CHEMISTRY SONG TEST MOUNTAIN 
BUBBLE DOG LEMON PICTURE POST BLANKET VIOLIN 
LAMP NIGHTMARE STEAM LEG WINDOW SWAMP STAMP. 
(The words were given in this order.) 


Part IV. Instructions 


The object of this test is to construct out of ten blocks on each trial, as
many pieces of furniture or home furnishings as possible. The piece of
furniture you construct must fit properly. It must be symmetrical and be
recognizable as a piece of furniture. Do not attempt to be futuristic. Use
conventional forms. You must use a minimum of two blocks to construct a piece
of furniture. You can use the same block over again to make another piece of
furniture. You can make as many of the same type of furniture as you like.
You will receive full credit for the same type that is only slightly different
from another.

You have only ten minutes to complete this test. There are five trials. 
Hence, you have only two minutes for each trial. 


In Figure 1, the forms of the ten blocks of one of the trays are presented.
The blocks used in all five trials were similar geometric shapes selected
from a box of playing blocks. On each trial the blocks were presented to
the subject on a piece of cardboard with each shape outlined so that the
positions of the blocks were standardized.

A record was kept of all of the combinations of blocks for which credit
was given.


1 The word “grammatical” as used in the instructions of Part I and Part III of this
test simply means adhering to the standard rules of grammar. If the subject writes a
sentence free of any errors of grammar, he obtains full credit for this sentence. The
subject is given one small liberty: the omission of the article before the subject or
object, e.g. “Boy meets girl.”

The phrase “logically related” in the instructions for Part III concerns the
continuity of the story, which must have a beginning and an end. An example of what is
meant by logically related would be the following sentence, “The stairs led down to
the ocean.” An example of a lack of logical relationship would be, “I walk down the
stairs. Last summer I took an ocean voyage. John studies chemistry.” These three
sentences are not descriptions of the same event. As long as the sentences are parts of
the description of the same event, the subject obtains full credit.








Fig. 1. The blocks used on one of the five trials of Part IV.


Subjects 


The test was given to a total of 78 subjects, 48 college students and 30
professional artists. The students ranged in age from 18 to 29 with a
mean age of 20. The artists ranged in age from 22 to 56 with a mean
age of 37.


Results 


The scores of the college group ranged from 23 to 56 with a mean of
37.6 and a standard deviation of 7.00, while the scores of the artist group
ranged from 39 to 89 with a mean of 60.5 and a standard deviation of
12.26. The difference between the means was statistically reliable
(D/σ = 9.3).
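
The critical ratio quoted above can be checked from the reported summary statistics. The following is only a sketch: it assumes the conventional standard error of a difference between two independent means, sqrt(s1²/n1 + s2²/n2), which the report does not spell out, and the function name `critical_ratio` is illustrative.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two means divided by its standard error.

    Assumes independent groups and the standard error
    sqrt(sd_a**2 / n_a + sd_b**2 / n_b). This formula is an assumption;
    the report gives only the resulting ratio.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return abs(mean_b - mean_a) / standard_error


# Total scores as reported in the text: college group, then artist group.
total_cr = critical_ratio(37.6, 7.00, 48, 60.5, 12.26, 30)
print(round(total_cr, 1))
```

Under this assumption the reported means and standard deviations give a ratio that rounds to the published figure of 9.3.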
The types of artists are classified as follows: 1 art director, 4 art
teachers, 4 cartoonists, 2 fashion artists, 1 furniture designer, 5 illustrators,
5 layout artists, 1 medical artist, 1 mural painter, 5 portrait and
landscape painters, and 1 sculptor.

















Table 1
The Means for All Four Parts of the Test

          College Group      Artist Group      Critical Ratio
            (N = 48)           (N = 30)           between
Parts     Mean    S.D.       Mean    S.D.          Means
I         18.0    4.24       17.7    7.15           0.2
II         6.7    1.83       12.5    1.93          13.2
III        9.1    3.15       11.4    4.09           1.1
IV         3.4    2.68       18.4    7.78          10.2





The Wonderlic Personnel Test was given to 48 of the college students 
and it was found that the correlation between the scores on this test and 
the test for the recombination of ideas was .27. No significant correlation 
was found between performance on the test for the recombination of 
ideas and chronological age. This indicates that the superior perform- 
ance of the artists could not be explained in terms of their added years of 
experience. 

There is no adequate way of rating artistic ability as in the case of 
academic achievement. We were fortunate enough, however, to have 
the opportunity of testing five commercial artists from the same ad- 
vertising company and were able to compare the scores on the test with 
the company’s opinion of the creativity of these subjects. The layout 
artist, who was considered the most imaginative, received the highest 
score (88), while the two fashion artists who were considered the most 
unimaginative received the two lowest scores (44 and 46). 

The one furniture designer who took this test received a score of 47, 
which was low for the artist group. On Part IV of the test, which has to 
do with the construction of furniture, this person only made a score of 15. 
(The highest score was 35 points.) 


Summary and Conclusions 


In this study we have assumed that one essential element of any type 
of creative thinking is the ability to recombine ideas readily according to 
a pattern or plan. It is recognized that the creative artist has many 
highly developed abilities, but in this experiment only his ability to re- 
combine ideas efficiently and quickly was tested. The test was divided 
into four parts and the materials used were as familiar to the layman as
they were to the artist. For example, the subject was required to re- 
combine familiar words into different sentences and to recombine blocks 
into symmetrical pieces of furniture. No skill or training is needed for 
this test that would not be well developed in grammar school. 

The test was given to 30 professional artists and to 48 college students. 
The mean score of the artist group was much higher than that of the 
college group. 

Part II (making letters out of certain types of lines) and Part IV 
(making pieces of furniture out of blocks) differentiated the professional 
artists from the non-artists. Part I (composing sentences from certain 
word groups) and Part III (writing a story using a list of dissociated 
words) failed to differentiate significantly between the two groups. 

The correlation between scores on this test and scores on the Wonderlic 
Personnel Test was very low. Moreover, chronological age did not 
appear to affect the scores. 


Received January 14, 1946. 





Questionnaire and Interview in 
Neuropsychiatric Screening * 


Daniel H. Harris 
Veterans Administration, New York City 


The combination of a “neurotic” questionnaire plus individual ap- 
praisal in the weeding out of potential neuropsychiatric casualties from 
the flood of incoming recruits and inductees showed gratifying results 
during the last couple of years of World War II. 

In the early days and months after Pearl Harbor, the NP screening 
of recruits by interview alone resulted in fantastic overworking of the 
few psychiatrists and psychologists available at the time to the armed 
services. Due to this overwork, it is likely that many misfits slipped 
through. 

On the other hand, the procedure of rejecting men solely on the basis 
of a questionnaire does not appear to have been adopted at any time. 
However, the use of such an instrument as a preliminary coarse filter to 
separate out those to be interviewed, of whom a fraction will be finally 
rejected, has been found to be effective and time-saving (1), (3), (4). 

In the writer’s opinion, this fact has large possible implications for 
vocational and any other kind of selection involving large groups, in 
addition to its demonstrated value for military selection in peace or war. 


Procedure 


The present findings are based on the use in the manner indicated of 
an instrument developed at the Newport Naval Training Station. It 
consists of two parts: a 32-item condensation of the Cornell Selectee
Index, Form N; and the Brown University Personal Inventory, Format 
C. Neither of these inventories has been released for general use, but 
results based on one or both have been published or referred to (1), (2), 
(3), (4). There have been no previously reported results based on Con- 
struction Battalion (Seabee) subjects. 

Between Feb. 1 and May 2, 1945, a total of 2,081 Seabee recruits were 
received for training at the Naval Construction Training Center at 
Davisville, R. I. The training period was of ten weeks’ duration. 


* The opinions and assertions contained in this paper are the private ones of the 
writer and are not to be construed as official or as reflecting the views of the Navy 
Department or of the Naval Service at large. 
On arrival at the Training Center, each recruit filled out the two-part
questionnaire. Administration was in groups of 50 to 100. After simple 
oral directions, it was usually completed in five or six minutes, and scoring 
by hand stencil took about 20 seconds per paper. Enlisted personnel 
aided in the administration and did the scoring, under supervision of the 
psychologist. 

Using the “cutting” scores as used at Newport, 297 men were screened
out by the questionnaire as possible NP casualties; the other 1,784
recruits were passed through by the questionnaire and went right on to
duty without interview.

Each of the 297 men screened out was interviewed very briefly by the
psychologist, usually within an hour or two of the questionnaire
administration. These interviews averaged not over three minutes apiece,
following which the man was sent on to full duty (N = 203), referred
to the psychiatrist (N = 52), or sent to trial duty (N = 42).

Of the 203 men sent to full duty, three were later referred to the NP 
department by their commanding officers during their ten-week training 
period. They were seen by the psychiatrist, whose ultimate disposition
was: NP discharge from the Navy (N = 2); and Return to duty (N = 1). 

Of the 52 men referred to the psychiatrist, four received medical or
surgical discharges from the service before any NP disposition could be 
made. The psychiatrist made the following disposition of the remaining 
48 men: NP discharge from the Navy (N = 37); and Return to duty 
(N = 11). 

The 42 men sent to trial duty were called back for a brief re-interview 
after two or three weeks. This was generally even shorter than the first 
interview. Usually a written paragraph of appraisal of his adjustment 
to military service so far was available for each man, from his chief petty 
officer. In two cases the man had received a medical or surgical dis- 
charge before the re-interview could take place. Of the remaining 40 
men, 26 were sent back to full duty following the re-interview, and none 
of these came again to the attention of the NP department. The other 
14 were referred to the psychiatrist, whose ultimate disposition was: 
NP discharge from the Navy (N = 7); NP transfer to Naval Hospital 
(N = 1); and Return to duty (N = 6). 

Of the 1,784 men who were not screened out by the questionnaire and 
so received no further NP attention on arrival, 23 were later referred to 
the NP department by their commanding officers during their training 
period. Their ultimate disposition by the psychiatrist was as follows: 
NP discharge from the Navy (N = 15); NP transfer to Naval Hospital 
(N = 1); and Return to duty (N = 7).

The initial and final disposition of all of the 2,081 recruits is shown in 
Table 1. 





Discussion

It is to be noted that the total number of “false positives” (i.e., men
screened out by the questionnaire but immediately or eventually found
fit for duty) was 244. This is 11.7% of the total number of recruits.
At Newport (4) with general service personnel, the “false positive” rate 
was about 25%. It is probable that the difference is accounted for by 
the differing group characteristics of the two populations. Seabee in- 
ductees were in general older, occupationally more skilled, and undoubt- 
edly differed in other, at present, non-demonstrable ways from the general 
service inductees taken in at a Naval Training Station. 

Also worth mentioning are what might be called the “false negatives”
(i.e., those who were passed through by the questionnaire but were later
adjudged to be NP casualties after referral by commanding officers during
training). As shown above, these numbered 16. This comes to 0.9%
of the 1,784 men passed through by the questionnaire, and constitutes
25% of the total number of NP rejections.
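
The figures in the two preceding paragraphs can be re-derived from the counts given under Procedure. The tally below is a sketch with illustrative variable names; it treats the six men who received medical or surgical discharges before any NP disposition as neither false positives nor NP casualties, which is the grouping that reproduces the totals stated in the text.

```python
# Counts taken from the Procedure section (2,081 Seabee recruits).
TOTAL_RECRUITS = 2081
SCREENED_OUT = 297                       # flagged by the questionnaire
PASSED = TOTAL_RECRUITS - SCREENED_OUT   # sent to duty without interview

# NP casualties (NP discharge or hospital transfer) among the flagged men:
# 2 from the full-duty group, 37 from direct psychiatric referral,
# 7 + 1 from the trial-duty group.
np_among_flagged = 2 + 37 + (7 + 1)

# Flagged men lost to medical or surgical discharge before NP disposition.
medical_losses = 4 + 2

# Flagged men who were immediately or eventually found fit for duty.
false_positives = SCREENED_OUT - np_among_flagged - medical_losses

# NP casualties among men the questionnaire passed through (15 + 1).
false_negatives = 15 + 1
total_np = np_among_flagged + false_negatives


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(false_positives, pct(false_positives, TOTAL_RECRUITS))
print(false_negatives, pct(false_negatives, PASSED), pct(false_negatives, total_np))
print(total_np, pct(np_among_flagged, total_np))
```

The same tally gives the 63 NP casualties and the 47 caught by the questionnaire that are cited in the Summary.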

As mentioned previously, the “cutting” scores here used were those
used at Newport. They were: a score of 9 or more on the Selectee Index 
condensation, and/or a score of 1 or more on the Personal Inventory 
Format C. Analysis of the data to see what would have been the effect 
on the “false positive” and “false negative” rates of lowering the cutting 
score on the Selectee Index reveals that the following would have hap- 
pened, with cutting scores ranging from the used score of 9 down to a 
score of 3: 


Table 3

                    False Negative Rate

Cutting        % of Men Passed       % of Total NP
Score          by Questionnaire      Rejections
   9                 0.9                 25.4
   8                 0.8                 23.8
   7                 0.7                 20.6
   6                 0.7                 20.6
   5                 0.6                 17.5
   4                 0.5                 14.3
   3                 0.4                 12.7





It can be seen that the “false negative” rate does not go down as fast
as the “false positive” rate goes up. However, in a situation where one
is solely interested in cutting down the “false negatives” without caring
how many perfectly good “false positives” are lost in the process, the
cutting score could be set so low that “false negatives” might be
practically eliminated. This can be done only when there is a rather
extravagant surplus of available manpower; but in this way it might be possible
to eliminate practically all potential NP casualties by a completely
automatic procedure involving no interviewing.


Summary 

1. Of 2,081 Seabee recruits, 63 became NP casualties during their 
ten weeks’ training period. 

2. Forty-seven (75%) of these NP casualties were screened out by a 
5-minute group questionnaire administered on arrival at the Training 
Center. 

3. Of 203 men screened out by the questionnaire as possible NP 
casualties but sent to duty the same day by the psychologist after brief 
interview, 2 became NP casualties during training. 

4. The combination of screening questionnaire and interview should 
be valuable for almost any kind of selection involving large groups. 

5. By setting the “cutting” score low enough it may be possible under
some conditions to select effectively by means of a questionnaire alone, 
without interview. 


Received December 6, 1945. 
Restandardization of the Revised Beta Examination
to Yield the Wechsler Type of IQ * 


Robert M. Lindner 
Haarlem Lodge, Catonsville, Maryland, 
and 
Milton Gurvitz 
Hillside Hospital, Queens, New York 


The Army Group Examination Beta developed during World War I 
has led to a considerable number of revisions and similar tests designed to 
permit measurement of illiterates, of persons who do not speak or read 
English, and of other individuals for whom a verbal test is not considered 
suitable. Perhaps the most important revision of this test is that by 
Kellogg and Morton,¹ published in 1934 and entitled “Revised Beta
Examination.” 

For many years this test has been used extensively. Its most common 
application apparently has been in penal institutions where it has been 
found useful for purposes of initial classification of committed persons. 
The scores have been found significant in relation to the psychiatric, 
educational, and vocational adjustment of persons in these institutions. 
A second major use of the test has been in selection and classification of 
employees in mass industries. The publisher reports that the test is 
usually sold in large quantities to institutions and large industries, al- 
though it is apparently used quite generally in small quantities for a 
variety of educational, vocational, and counseling purposes. 

Because of its general usefulness, the authors were interested in im- 
proving the administration and standardization of the test. This 
opportunity arose because of the large number of cases which was avail- 
able to them at the United States Federal Penitentiary at Lewisburg, 
Pennsylvania. Information was available concerning the subjects used 
which permitted a standardization to be made according to modern 
methods of equating the sample to the general population. A summary 
of the sampling procedure will be discussed below. 


*This paper is based upon a more extensive unpublished report of the research. 
Dr. Harold Seashore of The Psychological Corporation assisted materially in its prep- 
aration. This is a “prior publication,” authors paying costs. 

1 Kellogg, C. E., and Morton, N. W. Revised beta examination. The Psychological 
Corporation, 1935. Also see revised beta examination. Personnel J., 1934, 13, 98-99. 
In addition to planning a general restandardization there arose the
question of converting the scoring and interpretation of the test into a
more useful form. It was decided to follow the general method developed
by Wechsler² in the standardization of the Wechsler-Bellevue Intelligence
Scale.

The main features of the Wechsler type of scoring and standardization
are, first, that each of the subtests is converted into scaled scores so that
a profile is secured of the subtests and, second, that the computation
of the IQ takes cognizance of the fact that mental ability as measured
by the test declines with age after a peak of development in the early
twenties.


Changes in Administration and Scoring of the Subtests 


Minor changes have been made in the administration of some of the 
subtests, and certain of the scoring procedures for the subtests have been 
made more explicit and objective. There has been no change in the 
content of the test. These changes are of no interest at this moment; 
they will be presented in the revised manual which the publisher is 
making available. 


Weighted Scores for the Subtests 


At the present time the various subtests of the Beta Examination 
contribute differentially to the total and apparently there has been no 
demonstration that this is the optimum weighting. It can be assumed 
that each test is as good as any of the others. While strictly speaking 
this is not true, it is probably true enough to make any further attempt 
at refinement unwarranted. The authors therefore decided that the 
scoring should be arranged so that each subtest would contribute equally 
to the total score. The plan has two advantages. It allows an examiner 
to prorate a score when for some reason or another one or two of the sub- 
tests have to be omitted. In addition it may provide a valuable clinical 
tool to be used in screening out psychiatric deviates who frequently 
express their personalities by unequal performance on a battery of six 
subtests. Wechsler has presented considerable evidence along this line 
for his test and it is hoped that when this new standardization becomes 
more widely used, similar information regarding “scatter” can be de- 
veloped for the Revised Beta Examination. 

Each subtest is made to fit a 20-point scale with a mean of 10 and a
standard deviation of 3. Using Hull’s method,³ each raw point score of
each subtest is equated to the new scale. Tables of weighted scores are
provided in the new manual for the test. This weighting was
accomplished by taking 1,006 heterogeneous test papers that were available.
Several other criterion test scores were available, and the educational
level of these cases was known. When these criteria were correlated with
the raw scores and then with the newly designed weighted scores, there
were only small changes in the size of the coefficient of correlation.
Because these correlations may be of some general interest, they are
presented in Table 1.

² Wechsler, D. The measurement of adult intelligence. Baltimore: The Williams &
Wilkins Co., 1941.

³ Hull, C. L. Aptitude testing. Yonkers, N. Y.: World Book Co., 1928, p. 397 ff.
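
The rescaling of a subtest to a mean of 10 and a standard deviation of 3 can be illustrated with a short sketch. It assumes a plain linear (standard-score) conversion; the exact steps of the procedure cited from Hull are not given in this report, the raw scores are invented for illustration, and the function name `to_scaled_scores` is hypothetical.

```python
from statistics import mean, pstdev

SCALE_MEAN = 10  # target mean stated in the text
SCALE_SD = 3     # target standard deviation stated in the text


def to_scaled_scores(raw_scores):
    """Linearly rescale raw subtest scores to mean 10 and SD 3.

    Assumption: a simple z-score conversion based on the mean and
    population standard deviation of the scores supplied.
    """
    raw_mean = mean(raw_scores)
    raw_sd = pstdev(raw_scores)
    return [SCALE_MEAN + SCALE_SD * (x - raw_mean) / raw_sd for x in raw_scores]


# Hypothetical raw scores for one subtest (not data from the study).
example_raw = [4, 8, 10, 12, 16]
example_scaled = to_scaled_scores(example_raw)
print(example_scaled)
```

Because every subtest ends up with the same mean and spread, each contributes equally to the total, which is what makes prorating for an omitted subtest straightforward.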


Table 1

Correlation Coefficients of Beta Raw and Weighted Scores
with Other Variables (N = 1,006)

Beta Raw Score (Kellogg & Morton)      Beta Weighted Score (Lindner & Gurvitz)
                r                                        r

U. S. Public Health Service Classification Test Weighted Score       .90
U. S. Public Health Service Classification Test IQ                   .86
Last School Grade Completed                                          .61
Stanford Achievement Test, Paragraph Meaning                         .75
Stanford Achievement Test, Word Meaning                              .74
Stanford Achievement Test, Arithmetic Reasoning                      .72
Chronological Age                                                   -.30





The Selection of the Standardization Sample 


The research is based upon an original collection of Beta scores on 
more than 2,000 cases. This sample could not be considered a proper 
one upon which to standardize a test without further inquiry into its 
composition. It was soon discovered, as Wechsler had found, that the 
mean weighted score declined steadily when the persons in the sample 
were grouped in five-year intervals of age. This demonstrated clearly 
that the Beta Examination was functioning much as Wechsler’s test and 
that therefore a general application of his method of computing the IQ 
might prove desirable. Before proceeding, however, further refinements 
of the sample were necessary. 

All psychotics and extremely physically handicapped individuals were 
removed. The standardization, furthermore, was limited to male adults. 
Negroes were removed from the population. It was found, for instance, 
that age for age the negroes had an average of about one and one-half
fewer years of education than whites. With these preliminary refine- 
ments, a sample was secured of 1,800 white male adult prisoners ranging 
from 17 to 70, including almost every previous occupation, and having 
origins in all the states east of the Mississippi River and most of the 
Western States. 

The next decision was to select from this group individuals who would 
be distributed in an educational grouping similar to that shown by the 
1940 Census. Furthermore, this distribution was to take cognizance of 
age. A sampling was done in such a way that within each age grouping 
of 5 or 10 years in range, the individuals would be distributed educa- 
tionally in proportion to the distribution of white, male adults in these 
same age ranges in the 1940 Census. The third variable in the selection 
process was the socio-economic status of the individuals, which also was 
equated to agree with the report of socio-economic groups, by age, in the 
1940 Census. 

The details of this sampling process are too elaborate to justify
presentation in this report. The protocols are available from the junior
author and can be supplied to anyone interested in the method. The
combined technique of the whole process of taking into consideration age,
education, and socio-economic status resulted in the sample described in
Table 2. This table presents the number of cases and the means and
standard deviations of the weighted scores of the final standardizing
sample, by age groups, and as corrected for education and socio-economic
status. It will be noted that, beyond the 20-24 age group, the mean
weighted score declines and the standard deviation increases steadily
with increasing age. The smoothness of these lines indicates the quality
of the sampling procedure.


Table 2

Means and Standard Deviations of Weighted Scores of Standardizing Sample by Age
Groups, and Corrected for Education and Socio-Economic Status (N = 1,225)

Age       N       M       SD
16-19     85     65.0    12.1
20-24    220     66.9    11.9
25-29    195     65.0    12.9
30-34    197     62.1    14.1
35-39    200     58.7    14.9
40-44     90     55.3    15.9
45-49     83     52.0    16.8
50-54     80     48.7    18.2
55-59     75     45.6    19.4
Restandardization of Revised Beta Examination 


[Figure 1. Legend: Revised Beta weighted score; Revised Beta unsmoothed 
score; Wechsler-Bellevue weighted score. Horizontal axis: age in 5-year 
ranges.] 

Fig. 1. Variation in means of weighted scores by 5-year age ranges. 


A comparison of the Wechsler-Bellevue and the Revised Beta curves of 
mean score against age is shown in Figure 1. 


The Derivation of Norms 


Wechsler’s method for calculating IQs has been followed in detail. 
The data for converting weighted scores for each subtest to IQs for each 
age range are provided in the new manual and need not be repeated here. 
Table 3 presents evidence that the conversion of weighted scores to IQs 
by the new table has accomplished the purpose desired. Observe that 
both the mean IQ and the standard deviation of the IQ are relatively 
constant from age range to age range. 


Table 3 

Means and Standard Deviations of the IQs for Each Age Range of the 
Beta Standardizing Group 

Age       Mean 
20-24     100.1 
25-29      99.8 
30-34      99.8 
35-39      99.7 
40-44     100.8 
45-49      99.8 
50-54      99.4 

[The SD column of this table is not legible in this copy.] 

Correlation of Beta and Wechsler-Bellevue IQs 


One hundred and sixty-eight cases were tested with the Revised Beta 
Examination and with the Wechsler-Bellevue. The new scoring pro- 
cedure resulted in a coefficient of correlation of .92 between IQs on the 
two tests. The mean difference between the IQ scores, irrespective of 
sign, is 7.2. There is a general tendency for the Revised Beta Examina- 
tion to result in somewhat lower IQs for the same individuals, a tendency 
which can be regarded as desirable since it is known that the Wechsler- 
Bellevue is not so discriminating at the lower levels of ability. It is at 
the lower levels that Beta is particularly useful. On the other hand, 
the Beta Examination is not suitable for measuring IQs above about 120 
or 125. It is also likely that both tests lose some discriminative sensi- 
tivity above the age of 40. 
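
The three summary figures in this paragraph (the correlation, the mean difference irrespective of sign, and the direction of the difference) can each be computed from paired IQs as sketched below. The paired scores are invented for illustration and are not the 168 cases of the study; the function names are likewise hypothetical.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_abs_diff(x, y):
    """Mean difference between paired scores, irrespective of sign."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def mean_signed_diff(x, y):
    """Mean of x - y; a negative value means x runs lower than y."""
    return sum(a - b for a, b in zip(x, y)) / len(x)


# Invented paired IQs, for illustration only.
beta_iq = [70, 85, 100, 115]
wechsler_iq = [75, 90, 105, 120]

r = pearson_r(beta_iq, wechsler_iq)
abs_diff = mean_abs_diff(beta_iq, wechsler_iq)
signed_diff = mean_signed_diff(beta_iq, wechsler_iq)
```

The invented data show why all three figures are worth reporting: a constant offset between the two tests leaves the correlation perfect while still producing a sizable mean difference.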

Figure 2 is presented to compare Beta and Wechsler IQs at different 
IQ levels and at different ages. Each curve plots the IQ assigned to persons 
of a given age who have a weighted score which would yield the stated 
IQ at age 20-24. For instance, a person with a Beta IQ of 65 at age 20- 
24 would have a Beta IQ of 92 if he were age 50-54. The same person 
would have a Wechsler IQ at age 50-54 of 81. 
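
The Beta half of this example can be approximated from Table 2 alone. The sketch below is not the conversion table of the manual, which is not reproduced here. It assumes a simple linear deviation IQ (the weighted score expressed as a standard score within its age group, rescaled to a mean of 100) and an IQ standard deviation of 15, a value this report does not state.

```python
# Means and standard deviations of weighted scores by age group (Table 2).
NORMS = {
    "16-19": (65.0, 12.1),
    "20-24": (66.9, 11.9),
    "25-29": (65.0, 12.9),
    "30-34": (62.1, 14.1),
    "35-39": (58.7, 14.9),
    "40-44": (55.3, 15.9),
    "45-49": (52.0, 16.8),
    "50-54": (48.7, 18.2),
    "55-59": (45.6, 19.4),
}

IQ_MEAN = 100.0
IQ_SD = 15.0  # assumed value; not given in the text


def weighted_score_to_iq(score, age_group):
    """Deviation IQ: standard score within the age group, rescaled."""
    mean, sd = NORMS[age_group]
    return IQ_MEAN + IQ_SD * (score - mean) / sd


def iq_to_weighted_score(iq, age_group):
    """Inverse of weighted_score_to_iq for the same age group."""
    mean, sd = NORMS[age_group]
    return mean + sd * (iq - IQ_MEAN) / IQ_SD


def iq_at_other_age(iq, from_age, to_age):
    """IQ the same weighted score would earn in a different age group."""
    score = iq_to_weighted_score(iq, from_age)
    return weighted_score_to_iq(score, to_age)


print(round(iq_at_other_age(65, "20-24", "50-54")))  # prints 92
```

Under these assumptions a Beta IQ of 65 at age 20-24 corresponds to a weighted score of about 39, which at age 50-54 gives an IQ of about 92, in agreement with the figure quoted above. The agreement is encouraging but does not show that the manual's tables were built in exactly this way.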

One should remember that the IQ tables for these tests are constructed 
so as to yield an average IQ of 100 for each age group. The steepness of 
the curve in Figure 2 then indicates the correction for age which was 
necessary to make the IQs equal at different ages. Several observations 
can be made from these figures. 


1. In general, the Beta test tends to require a greater correction 
for age, as shown by greater steepness in most of the curves. 

2. Both tests need more correction for age at lower IQ levels, as 
indicated by the greater steepness of the curves for lower IQs. 

3. The Beta IQ and Wechsler IQ curves are more discrepant at the 
lower IQs. No one knows whether the Beta or Wechsler data are better 
descriptions of the effect of senescence on ability. The general trends 
for the two tests are the same and the differences are doubtless due to 
the content of the tests and perhaps to the method of administration. 

4. If an older person is assigned a Beta IQ of 100 and a Wechsler IQ 
of 100, the difference in actual test performance from that of 
20-year-olds is greater on the Beta than on the Wechsler. 


Fig. 2. IQ variation with age for a weighted score giving a 
specific IQ at age 20-24. 


Sex and Race Differences 


The question of whether the newly devised norms are suitable for 
females can only be answered on a theoretical basis as no women subjects 
were available to the authors for the standardization. The Beta has been 
used for testing women as well as men for many years and the authors 
know of no report indicating that different norms were required for 
different sexes. A study of the census data indicates that there is no 
appreciable difference between the educational status achieved by native 
white males and native white females. Furthermore, there is no other 
intelligence test that we know of that uses a separate scale for men and 
women. If the test is to be used in a large-scale program where separate 
norms can be obtained, it might be desirable for someone to make an 
experimental study of this problem. 

The authors have considerable evidence at hand to indicate that the 
difference in performance on the Beta Examination between whites and 
nonwhites is roughly proportional to the difference in mean number 
of grades completed. This, of course, is not adequate evidence to suggest 
that the norms of the Revised Beta would be applicable to nonwhites. 
The authors made several attempts to find a way of extending the 
standardization to yield meaningful scores for negroes. This plan was 
abandoned for several reasons, but primarily because the sampling of 
negroes produced norms that seemed to exaggerate the differences between 
urban northern-born negroes and rural southern negroes. 

Psychologists should be thoroughly acquainted with the literature 
of the effects of culture and education on tests, and it is assumed they 
will have this background information when they are evaluating IQs 
on this test or any test when individuals from atypical cultures are being 
measured. 


Suitability to the General Population 


This test was standardized on adult, white, male prisoners. Can 
such a standardizing population produce norms applicable to the general 
population? Several considerations appear. The standardization does 
not reflect a prison population when it utilizes a sampling based upon 
educational and socio-economic standards determined by the 1940 Census. 
New residents of the Federal Penitentiary at Lewisburg were tested within 
one week of their entrance into the institution, a fact of importance since 
experience shows that incarceration for a longer period of time can develop 
stereotyped modes of thought and expression. It should also be pointed 
out that this penitentiary does not receive many established criminals; it 
is an institution for adults who are considered improvable offenders. 
They are essentially not criminals but lawbreakers. 


The Classification of Intelligence 


IQs as calculated by this revision of the Revised Beta Examination 
must be recognized as relative indices of the degree of intelligence. IQs 
determined by this method should always be labelled “Beta IQ” and not 
simply “IQ.” In interpreting the score one should also be aware of the 
fact that a Beta IQ of 70 in an older person is different from a Beta IQ 
of 70 in a younger person with respect to the performance on which it is 
determined. The authors would like to propose a method of reporting 
whereby both the IQ and the weighted score will be interpreted. Tables 
4 and 5 present these two modes of classification. 


Table 4 

IQ Classification * 
(Based on Weighted Scores and Age) 

IQ              Classification 
129 and up      Very Superior 
120-128         Superior 
110-119         Above Average 
90-109          Average 
80-89           Below Average 
71-79           Inferior 
70 and below    Defective 

* Classification system same as that used by Wechsler. 


Table 5 

Weighted Score Classification 
(Without Regard to Age) 

Weighted Score    Classification 
90 and above      A 
83-89             B 
75-82             C+ 
59-74             C 
51-58             C- 
43-50             D 
42 and below      E 
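
The proposed double report can be sketched as a pair of lookups over Tables 4 and 5. The function names and the wording of the report line are illustrative only, and the C- band is taken as 51-58, the only range consistent with the bands on either side of it.

```python
# Lower bounds of each band, highest first (Table 4).
IQ_BANDS = [
    (129, "Very Superior"),
    (120, "Superior"),
    (110, "Above Average"),
    (90, "Average"),
    (80, "Below Average"),
    (71, "Inferior"),
]

# Lower bounds of each band, highest first (Table 5).
# The C- band is assumed to be 51-58.
SCORE_BANDS = [
    (90, "A"),
    (83, "B"),
    (75, "C+"),
    (59, "C"),
    (51, "C-"),
    (43, "D"),
]


def classify_iq(beta_iq):
    """Table 4 classification for a Beta IQ (already corrected for age)."""
    for lower, label in IQ_BANDS:
        if beta_iq >= lower:
            return label
    return "Defective"  # 70 and below


def classify_weighted_score(score):
    """Table 5 letter classification, without regard to age."""
    for lower, label in SCORE_BANDS:
        if score >= lower:
            return label
    return "E"  # 42 and below


def report(beta_iq, score):
    """One-line report giving both modes of classification."""
    return (f"Beta IQ {beta_iq} ({classify_iq(beta_iq)}); "
            f"weighted score {score} ({classify_weighted_score(score)})")


print(report(92, 39))  # prints: Beta IQ 92 (Average); weighted score 39 (E)
```

The printed line illustrates the point of the double report: an older person may fall in the Average range for his age while his absolute performance stands in the lowest weighted-score class.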


Intercorrelations of Subtests 


The intercorrelations of the subtests of the Revised Beta Examination 
as administered and scored by the procedure reported by the authors in 
the new test manual are given in Table 6. 
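
The internal consistency of Table 6 can be checked mechanically. The sketch below enters the coefficients with leading decimal points and tests two things: that the matrix is symmetric, and that each entry in the "Average" row is the mean of the six other coefficients in its column. The second point is an inference from the figures, not something the text states.

```python
SUBTESTS = ["Weighted Score", "Maze", "Digit Symbol", "Error Recognition",
            "Form-Board", "Picture Completion", "Identities"]

# Correlations as read from Table 6; None marks the diagonal.
R = [
    [None, .68, .86, .82, .75, .83, .78],
    [.68, None, .62, .51, .52, .55, .54],
    [.86, .62, None, .60, .57, .67, .72],
    [.82, .51, .60, None, .74, .76, .58],
    [.75, .52, .57, .74, None, .62, .51],
    [.83, .55, .67, .76, .62, None, .56],
    [.78, .54, .72, .58, .51, .56, None],
]

PRINTED_AVERAGE = [.79, .57, .67, .67, .62, .67, .62]


def is_symmetric(matrix):
    """True if every coefficient equals its mirror across the diagonal."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


def off_diagonal_means(matrix):
    """Mean of the coefficients in each row, leaving out the diagonal."""
    return [sum(v for v in row if v is not None) / (len(row) - 1)
            for row in matrix]


for name, mean, printed in zip(SUBTESTS, off_diagonal_means(R), PRINTED_AVERAGE):
    print(f"{name:<20} computed {mean:.3f}   printed {printed:.2f}")
```

Every computed mean falls within rounding distance of the printed value, which supports this reading of the table.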


Summary 


1. The Revised Beta Examination has been restandardized to ac- 
complish three purposes: (a) The administration and scoring procedures 
have been improved. (b) The sample of adults upon which new norms 























658 Robert M. Lindner and Milton Gurvitz 
Table 6 

Intertest Correlations of the Subtests and Weighted Score (N = 1,006) 

                      Weighted           Digit     Error         Form-    Picture 
                      Score     Maze     Symbol    Recognition   Board    Completion   Identities 
Weighted Score         --       .68      .86       .82           .75      .83          .78 
Maze                  .68        --      .62       .51           .52      .55          .54 
Digit Symbol          .86       .62       --       .60           .57      .67          .72 
Error Recognition     .82       .51      .60        --           .74      .76          .58 
Form-Board            .75       .52      .57       .74            --      .62          .51 
Picture Completion    .83       .55      .67       .76           .62       --          .56 
Identities            .78       .54      .72       .58           .51      .56           -- 
Average               .79       .57      .67       .67           .62      .67          .62 





are based has been selected to represent the 1940 Census with respect 
to education and socio-economic status within several age groups from 
twenty years and above. (c) The standardization permits the securing 
of Beta IQs which are similar in meaning (though not necessarily in size) 
to the IQs secured on the Wechsler-Bellevue Intelligence Scale. 

2. The procedures for the selection of the sample are briefly reviewed, 
and the authors feel that they have standardized the test on a sample of 
white, male adults above the age of 20, which is representative of the 
general population. 

3. The method of weighting the subtests and determining the IQs as 
worked out in this research resulted in average IQs and standard devia- 
tions of the IQs which are equivalent for various age ranges. The 
average weighted score showed a steady decline with age similar to that 
shown in Wechsler’s research. 

4. It is believed that the Revised Beta Examination, when admin- 
istered and interpreted according to the procedures outlined in the new 
manual for this test, should prove to be an excellent group test for 
measuring the mental ability of adults. One of its most useful applica- 
tions will be in the initial classification of persons who are committed 
to mental and penal institutions. It is also recommended for use with 
illiterates, with persons who do not speak or read English, or with others 
for whom a verbal test is not considered suitable. It can be considered 
a satisfactory verifying examination in connection with other more 
verbal paper group tests or individually administered mental exami- 
nations. 

5. When reporting results on Revised Beta Examinations, ad- 
ministered and scored according to this new standardization, the proper 
term to use is “Beta IQ.” 


Received October 14, 1946. 
N O R C Interview Department. Interviewing for N O R C. Denver: 
National Opinion Research Center, 1945. Pp. + 154. $2.00. 


From the time of the first public opinion survey by a nation-wide 
interviewing staff in 1935, critics have pointed out that the results of such 
polls are frequently impaired by the bias, dishonesty or carelessness of 
interviewers. The National Opinion Research Center has recognized the 
importance of the training problem implied in this criticism, and this book, 
written by the N O R C Interview Department as a manual for the 200 
part-time interviewers who work for N O R C in all parts of the country, 
represents a painstaking effort to solve this problem. 

Throughout the volume an attempt is made to give the part-time 
interviewer a complete picture of the various stages a N O R C survey 
must go through before and after the actual interviewing job is done. 
The interviewer is not merely given detailed rules and suggestions, but 
he is told specifically how his errors may cause difficulty for the central 
staff at Denver or New York. The interviewer is also told quite frankly 
what criteria are used in rating his ability. For instance, performance 
on free-answer questions is described as one of the principal measures of 
ability. An interviewer who records a great variety of comments in 
language which seems to mirror his cross section appropriately is con- 
sidered superior to the interviewer who reports only brief stereotyped re- 
marks using his own word-patterns over and over again. By taking the 
part-time interviewer into their confidence in this manner, the authors 
have attempted to make the interviewer feel that he is an integral part 
of the organization. 

Although the volume deals specifically with N O R C problems, some 
sections are of general interest to anyone who is engaged in opinion poll 
interviewing. This is especially true of the section called “How to Get 
a Good Interview.” The authors have attempted the very difficult task 
of teaching an interviewing technique which is objective, standardized, 
free from interviewer bias, and at the same time, informal and capable of 
securing more than superficial responses. Interviewers are instructed to 
accept “don’t knows” and “qualified answers” only after probing for a 
more definite answer. Standardized probing is to be done primarily by 
repeating the question verbatim. Important words can be given a 
stronger emphasis, or phrases which do not change the question meaning, 
such as “On the whole” or “Well, in general,” can be prefaced to the 
question. 

Other sections of the book dealing with such problems as quota 
controls, factual data, rural interviewing and telegraphic polls do not offer 
much information of general interest, although the material is presented 
in a readable manner. 
To the best knowledge of the reviewer, no other similar manual for 
opinion poll interviewers is now available to the public. N O R C is to 
be commended for filling this need and for doing a workmanlike job. 


Philip H. Kriedt 
University of Minnesota 


Smith, B. L., Lasswell, H. D., and Casey, R. D. Propaganda, com- 
munication, and public opinion. Princeton: Princeton University 
Press, 1946. Pp. 435. $5.00. 

This book is an invaluable reference book for the research applied 
psychologist. In addition to four essays on the science of mass com- 
munication, there is an annotated bibliography of 2,558 titles classified 
under seven main headings with numerous sub-headings. A total of 
150 major titles are identified as being of especial value for the scientific 
student. 

This Reference Guide is a continuation of the work by the same 
authors which was published in 1935 under the title Propaganda and 
Promotional Activities: An Annotated Bibliography. The 1935 book 
listed about 4,500 titles. The present book adds some 2,500 titles which 
have appeared for the most part since 1935. This indicates the rapid 
growth of scientific interest in the analysis of propaganda and other forms 
of mass communication. 

The applied psychologist will be especially interested in those titles 
classified under theory and measurement. A total of 321 titles are in- 
cluded in the former category and 263 in the latter. 


Donald G. Paterson 
University of Minnesota 


Munroe, Ruth L. Prediction of the adjustment and academic performance 
of college students by a modification of the Rorschach Method. Applied 
Psychology Monographs, No. 7, September 1945. Pp. 104. $1.25. 
One of the facts about the Rorschach literature which is disturbing 
to those American psychologists who are interested but skeptical is the 
pitiably small number of studies which can really be called “validation” 
studies in any respectable sense, buried among a great mass of 
investigations whose titles would suggest that they were systematic studies of 
validity, but which turn out not to be so at all upon reading. It is, 
therefore, gratifying to come across this work of Dr. Munroe’s, which is 
a contribution both to college guidance and to the case for the Rorschach. 
She presents data from three successive freshman classes, totalling 348 
girls, at Sarah Lawrence College. Her aim was to determine the efficiency 
of the Rorschach in predicting “adjustment” and academic achievement. 
The Rorschachs for the first 100 cases were given in the usual 
manner, but scored by Munroe’s “Inspection Method,” and 60 of them 
were administered by someone else and scored later by the author. 
During the second two years of study the Harrower Group Rorschach was 
given and scored also by the Inspection Method. The “validating” 
criteria include several studies utilizing short “blind” personality sketches 
and the method of correct matchings, previously reported in part; 
academic achievement as indicated by scholarship ratings of the Student 
Work Committee (letter grades are not given at Sarah Lawrence); ad- 
justment ratings as indicated by faculty consultations and referrals to 
the psychiatrist; and a rating of maladjustment by the Student Work 
Committee. 

Because of the nature of the criteria available, data are not presented 
in correlational form, but are expressed in terms of contingency tables. In 
one respect this turns out to be a decided advantage, since it brings out 
certain relationships which a Pearson r as ordinarily employed would 
possibly obscure. For example, the Rorschach used alone is about 
equally efficient in predicting academic success as the ACE used alone. 
Contingency coefficients are .43 and .36 respectively. Since there is 
practically no relation between the two, combining them improves 
prediction slightly, as shown by a contingency coefficient of .50. It is 
noted that the ACE is more effective than the Rorschach in predicting 
definitely superior work, whereas students doing poor work in spite of 
high ACE scores tend to have “maladjusted” Rorschachs. These are, 
of course, the results that one might expect theoretically. 
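
For readers unfamiliar with the statistic, the contingency coefficient is obtained from a chi-square computed on a table of frequencies, C = sqrt(chi-square / (chi-square + N)). The sketch below uses invented tables, not Dr. Munroe's data, and the function names are illustrative.

```python
import math


def chi_square(table):
    """Pearson chi-square for a two-way table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip cells in empty rows or columns
                total += (observed - expected) ** 2 / expected
    return total


def contingency_coefficient(table):
    """C = sqrt(chi2 / (chi2 + N))."""
    chi2 = chi_square(table)
    n = sum(sum(row) for row in table)
    return math.sqrt(chi2 / (chi2 + n))


# Invented 2 x 2 tables, for illustration only.
perfect = [[10, 0], [0, 10]]   # every case falls on the diagonal
unrelated = [[5, 5], [5, 5]]   # no association at all

print(round(contingency_coefficient(perfect), 3))    # prints 0.707
print(round(contingency_coefficient(unrelated), 3))  # prints 0.0
```

Even a perfect association in a 2 x 2 table yields only .707, since the upper limit of C depends on the size of the table. Values such as .43, .36, and .50 should therefore not be read on the same scale as a product-moment r.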

The study is unfortunately marred by a few minor defects and devia- 
tions from perfect control which leave a way out for anyone who is 
adamantly skeptical about the Rorschach. For example, it would have 
been better had all of the tests been scored completely “blind,” without 
any opportunity for clinical impressions to be formed in face-to-face 
contact. The author states that the results on those cases she tested 
personally were not “significantly” better than the others. The third 
freshman class might well have been excluded to increase the purity of 
design, since their records were filed and available to teachers before the 
year’s close. Here, however, Dr. Munroe states that the results for the 
third year were not significantly better than those of the first two. 
One would like to have the data separately analysed for the middle year, 
when the most rigid control was exercised. On the whole, however, the 
study could be a pretty good model for other Rorschachers, and it cer- 
tainly is an important contribution to Rorschach literature. 


Paul E. Meehl 
University of Minnesota 


Rapaport, David (with the collaboration of Gill, Merton, and Schafer, 
Roy). Diagnostic psychological testing. Chicago: The Year Book 
Publishers, Inc., 1946. Vols. I and II, pp. xxii + 1098. $13.00. 
These volumes summarize an extensive and systematic investigation 
of the differential diagnostic potentialities of a psychometric battery 
consisting of the Wechsler-Bellevue Adult Intelligence Test and the 
Babcock Deterioration Test as tests of intelligence; the Object Sorting 
Test and the Hanfmann-Kasanin Test as tests of concept formation; and 
the Word Association Test, the Rorschach Test, and the Thematic 
Apperception Test as tests of ideational content and personality. 
Fifty-four “normal” control cases, selected at random from the Kansas 
Highway Patrol, and 217 psychiatric cases, apparently from the Menninger 
Clinic, serve as subjects. On the basis of psychiatric and social history 
data, the control group is divided into three sections according to 
excellence of adjustment. Similar but more extensive data provide for the 
categorization of the clinical cases as “schizophrenic,” “preschizophrenic,” 
“paranoid condition,” “depressive,” or “neurotic,” a finer classification 
being employed as the needs of analysis dictate. 

A substantial portion of the text is devoted to development of a psy- 
chological rationale for each test, along with supplementary validation 
data for some of the tests. Considerable space is given to discussion of 
the general application of those of the tests which hitherto have been 
credited with only a specialized and restricted utility. Test by test 
statistical and clinical comparisons between the control and clinical sub- 
groups reveal the diagnostic potentialities of the battery. The numer- 
ous psychometric patterns and indicators of diagnostic import are dis- 
cussed and described in some detail. The appendices include a review of 
pertinent literature on each test as well as the test scores and history data 
for each subject. 

This research encompasses much crucial but essentially unexplored 
or controversy-ridden territory in clinical psychology. The authors are 
both free and ingenious in proposing and provisionally substantiating 
hypotheses in their development and discussion of test rationales. These 
hypotheses and rationales are so numerous and extensive that they can- 
not be discussed adequately in this brief review, however. 

Prolixity in the text, lack of illustrative clarity in the graphs, and 
occasional reliance on verbal argument when statistical argument might 
have been more lucid and forceful weaken the presentation at times. 
In some instances verbal recapitulation of statistical tables unduly 
lengthens the text. Limitation of the statistical argument largely to 
the “t” and “Chi Squared” tests of significance may have reduced 
somewhat the fecundity and practical utility of the analysis, possibly forcing 
the authors to resort, for purposes of argument, to the less rigorous 
clinical impressions and verbal analyses more often than was strictly 
necessary. 

The authors present no summarizing statistics on the differential 
diagnostic accuracy of the battery as a whole. One of their aims was 
“to show how the . . . tests . . . were welded in (their) clinical work into 
a single diagnostic tool.” This omission is thus glaring and 
disappointing, the more so because of the crucial character of such a summary. 

These volumes possess redeeming features which outweigh their 
defects, however. The wealth of clinical information they contain as 
well as the stimulating speculations and informative discussions concern- 
ing test validation, problems encountered in the clinical use of the tests, 
and possible psychopathology underlying various forms of impairment 
in test performance should make them worthwhile aids to clinical psy- 
chologists. Both the clinical and the heuristic value of these books 
more than justifies their inclusion in class reading lists and clinic libraries, 
though they can hardly be considered as text material for other than 
advanced courses. 


Howard F. Hunt 
Stanford University 


Hayes, Samuel P. Vocational aptitude tests for the blind. Perkins Insti- 
tution and Massachusetts School for the Blind, Watertown 72, 
Massachusetts, 1946. Pp. 32. 25 cents. 


In a small readable book written in non-technical language, Hayes 
briefly surveys the attempts to develop vocational aptitude tests for the 
blind. The approach is historical, and gives quiet testimony to the 
prominent part played by the author in the development of psycho- 
logical tests for this group. 

The scope of the treatment is indicated by the opening definition of 
vocational aptitude tests as those “designed to measure the special 
abilities needed for success in some specific occupation.” Hayes 
intentionally excludes not only general intelligence and achievement tests, which 
he has discussed in numerous other articles, but also personality “tests,” 
which need little change for use with the blind except their transformation 
into braille. The survey encompasses psychological measurements under 
the classifications of mazes and formboards, musical, manual and 
mechanical, and scholastic aptitude tests. 

The book is not a test administrator’s manual but a description of 
the research efforts made to date within the scope of the given definition. 
At appropriate points in the book the author makes thoughtful appraisals 
based upon his experiments and experiences. 

The dearth of research in this field is indicated by the inclusion of 
twenty titles in the bibliography. Fifteen of these are specifically on the 
blind. Only two of the articles report efforts to validate the tests against 
job performance, Bauman’s work representing the sole creditable attempt. 

Those psychologists, educators, and counselors who take sides on the 
question of the importance of tests in the guidance of the blind will find 
further material for discussion in this book. Hayes reports that Herbert 
Moore, after using a few unstandardized tests on the blind pupils of 
several residential schools in 1935, concluded that “measurements of 
tactual and motor aptitude must occupy a relatively subordinate place 
in student guidance.” Bauman’s more recent experience in a guidance 
clinic convinced her that “Tests can be of great value in the selection of 
the job in which a chosen individual shall be placed.” 

Hayes did not choose to include in this book an account of the efforts 
made to date to adapt interest and personality inventories for the blind. 
However, surveys on intelligence and achievement tests, and on 
personality inventories have been covered in other publications of the author. 
His most recent review was given in a paper presented at the 37th 
Convention of the American Association of Instructors of the Blind in 1944, 
under the title, “What’s New In Testing the Blind?” 

This newest book by an experimenter who has worked almost alone 
for three decades in a psychology of the blind ought to be read by all 
professional persons interested in the education, guidance and vocational 
rehabilitation of the blind. Clinical and Industrial Psychologists will 
learn from this book where they may find source information—first, on 
the experimental adaptations for the blind of some tests now widely used 
with the sighted, secondly, on the literature which describes past attempts 
to construct special tests for the blind. Research workers interested in 
the construction of a battery of vocational aptitude tests for the visually 
impaired will find excellent leads for further study. Graduate students 
should have little trouble in locating a wealth of “problems” that have 
both academic and practical significance. 

The reader may be disconcerted to learn that so few have turned their 
energies to the vocational testing of the blind. It is hoped that the book 
will stimulate more systematic, widespread and coordinated endeavors 
to conduct studies in this field. Such investigations will make 
contributions not only to theoretical knowledge but also to human and social 
values. A psychology of the blind gives fresh insight into the psychology 
of all human personality. 


Salvatore G. DiMichael 
Office of Vocational Rehabilitation, 
Washington, D. C. 


Lowenfeld, Berthold. Braille and talking book reading: a comparative 
study. New York: American Foundation for the Blind, 1945. Pp. 53. 


This monograph describes a scholarly attempt to determine the prac- 
tical usefulness of a new educational device for the visually handicapped. 

For blind adults the talking book is an unqualified success—a gift 
from heaven. Although it was invented less than fifteen years ago, 
nearly 30,000 machines are already in use, furnishing recreational reading for 
many who have never mastered the slow and difficult process of reading 
with the fingers, while widening the intellectual horizon for those who can 
read braille. Educators of blind children are debating the wisdom of an 
extensive use of the device in school work. Is there a danger that chil- 
dren might refuse to learn to read braille? Will their spelling suffer? 
Certainly a wider acquaintance with literature, social studies and science 
would result, compensating for the limitations imposed by finger reading, 
which averages only about one-third as fast as reading with the eyes. 
And the superior pronunciation of the professional readers whose voices 
come from the records might promote good speaking habits in the 
listeners. 

The experiments reported in this monograph were planned to compare 
the speed and comprehension of talking book reading with braille read- 
ing, and incidentally to throw some light upon the children’s own prefer- 
ences in talking book material. 

Carefully controlled experiments were made in which the McCall- 
Crabbs Standard Test Lessons in Reading were presented to 481 pupils 
in grades three, four, six and seven in twelve residential schools for the 
blind. The material was given in braille and by the use of three different 
kinds of talking book records—simple readings, readings with sound 
effects dubbed in from sound effect records, and readings with dramatiza- 
tions performed by experienced actors. The tests included stories and 
factual textbook material. Comprehension was measured by multiple 
choice questions phrased to test understanding rather than mere rote 
memory. After the tests had been completed the children were asked to 
list the four stories they liked best in order of preference. 

Wide differences in rate of braille reading were found in the different 
schools tested, but for the whole group studied, the rate for braille reading 
was only about one-third the rate at which the material was presented on 
the talking book. If comprehension is satisfactory, a clear case would 
seem to have been made for the use of the talking book, since three times 
as much material would be covered. 

At the third and fourth grade level, comprehension of straight talking 
book reading is significantly superior to braille reading. At the sixth 
and seventh grade level no significant difference in comprehension was 
noted for stories, while comprehension of textbook material was signifi- 
cantly better when presented in braille. The author suggests that this 
superiority of braille reading may be explained by “past practice and 
habituation” and that practice with the talking book and the develop- 
ment of purposive listening techniques might well change the relative 
success of the two methods. Sound effects and dramatization added 
greatly to the children’s pleasure but did not improve their comprehen- 
sion scores. 

The following recommendations of the author seem fully justified by 
the results of the experiments: 


1. “Since pupils on the third and fourth grade level read about three 
times as fast by the talking book as in braille and since comprehension of 
talking book reading is superior to that of braille reading, the use of the 
talking book at this level is strongly recommended in order to compensate 
at least in part for the slowness of braille reading.” 

2. “Sound effects and dramatizations used in connection with talking 
book reading are an attractive feature, the use of which is suggested to 
stimulate reading interest in blind pupils.” 

3. “On the sixth and seventh grade level where pupils have acquired 
some proficiency in braille reading, the use of the talking book is recom- 
mended because its rate, which at this level is about two and a half times 
as fast as that of braille reading, will permit much wider reading.” But 
the author suggests that informational material for which the fullest 
possible comprehension is essential be read in braille. 

4. “Many pupils of lower intelligence never achieve any real 
proficiency in braille reading in spite of long and laborious instruction and 
practice. The use of the talking book is particularly recommended for 
these pupils.” 

Samuel P. Hayes 
Perkins Institution and 
Mass. School for the Blind 
Erratum 


In the October 1946 issue of the Journal of Applied Psychology, an 
error occurred in the article, “Age of Starting to Contribute versus 
Total Creative Output” by Harvey C. Lehman. On page 466, line 13 
read, “important contributions as late as age 20,” whereas it should have 
read “important contributions as late as age 30.” 